OpenAI trained its AI models on YouTube data: Report


Sam Altman-run OpenAI, which is backed by Microsoft, reportedly trained its artificial intelligence (AI) models on data scraped from Google-owned YouTube.

According to a report in The Information, OpenAI “has secretly used data from the site (YouTube) to train some of its artificial intelligence models”.

YouTube is the single biggest and richest source of imagery, audio and text transcripts on the web.

While Google researchers have been using YouTube to develop the company’s next large language model, called Gemini, “the value of YouTube hasn’t been lost on OpenAI, either”.

YouTube’s terms of service ban using content for anything other than “personal, non-commercial use.”

However, it is an open secret in the AI industry that everyone scrapes the web, and OpenAI reportedly “scraped” YouTube data to train its AI models, which have since become hugely popular worldwide.

OpenAI did not immediately comment on the report.

OpenAI has just released new versions of its text-generating AI models, GPT-3.5-turbo and GPT-4, with a capability called function calling.

With the function calling capability, developers can create chatbots that answer questions by calling external tools (like ChatGPT Plugins).
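As a rough illustration of how function calling can work on the developer's side, here is a minimal sketch in Python. It makes no network request: the model's reply is simulated, and the `get_weather` tool, the `TOOLS` table and the `dispatch` helper are hypothetical names invented for this example. It assumes the model returns a function name plus its arguments encoded as a JSON string, which the developer's own code then runs.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical tool: returns canned data instead of calling a real service."""
    return {"city": city, "forecast": "sunny"}


# Description of the tool in JSON Schema style, the kind of structure a
# developer would send to the model alongside the chat messages.
FUNCTIONS = [
    {
        "name": "get_weather",
        "description": "Get the weather forecast for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

# Maps the function names the model may request to local Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(function_call: dict) -> str:
    """Run the tool the model asked for and return its result as a JSON string."""
    name = function_call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown function: {name}")
    args = json.loads(function_call["arguments"])  # arguments arrive as JSON text
    return json.dumps(TOOLS[name](**args))


# Simulated model reply (an assumption for this sketch, not real API output).
simulated_call = {"name": "get_weather", "arguments": '{"city": "Delhi"}'}
tool_output = dispatch(simulated_call)
```

In a real chatbot, `tool_output` would be sent back to the model so that it can phrase the final answer for the user. The model only proposes the call; the developer's code decides whether and how to execute it.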

Meanwhile, Google last month upgraded its Bard chatbot with a new machine-learning model that can better understand conversational language and compete with OpenAI’s ChatGPT.

The tech giant has introduced improvements to its AI chatbot Bard, including better logic and reasoning skills.

Bard now uses a new technique called “implicit code execution” to recognise computational prompts and run code in the background, the tech giant said in a blogpost on Wednesday.

As a result, it can respond more accurately to prompts involving string manipulation, coding questions and mathematical operations.
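The general idea of spotting a computational prompt and running code instead of guessing the answer can be sketched in a few lines of Python. This is a toy under stated assumptions, not Google's implementation: the `answer` function, the regular-expression heuristic and the restriction to basic arithmetic are all choices made for this example.

```python
import ast
import operator
import re

# Only these arithmetic operators are allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

# Heuristic: a run of digits, dots, spaces, operators and parentheses.
_EXPR = re.compile(r"[\d.\s+\-*/()]+")


def _eval(node):
    """Evaluate a parsed arithmetic expression without using eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def answer(prompt: str):
    """Return a computed result if the prompt contains arithmetic, else None."""
    spans = [m.group().strip() for m in _EXPR.finditer(prompt)]
    # Keep only spans that contain both a digit and an operator.
    spans = [s for s in spans
             if any(c.isdigit() for c in s) and any(c in "+-*/" for c in s)]
    if not spans:
        return None  # not a computational prompt; fall back to the language model
    expr = max(spans, key=len)
    try:
        return _eval(ast.parse(expr, mode="eval").body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


result = answer("What is 12 * (3 + 4)?")
```

Walking the parsed syntax tree rather than calling `eval()` keeps the sketch from executing arbitrary code. A prompt with no arithmetic in it returns `None`, which stands in for handing the question to the language model as usual.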

20230615-111602
