OpenAI used the subreddit r/ChangeMyView to create a test that measures the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) released alongside its new "reasoning" model, o3-mini, on Friday.
Millions of Reddit users are members of r/ChangeMyView, where they post hot takes in the hope of learning about opposing points of view. In response to these hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.
The subreddit is one of many Reddit forums that are essentially a gold mine for tech companies such as OpenAI, which want to train AI models on high-quality, human-generated data.
OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user's mind on a subject. The company then shows the responses to testers, who assess how persuasive each argument is. Finally, OpenAI compares the AI models' responses with human replies to the same post.
The ChatGPT maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display those posts within its products. It is not known what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.
However, OpenAI tells TechCrunch that the ChangeMyView-based evaluation is unrelated to its Reddit deal. It is unclear how OpenAI accessed the subreddit's data, and the company says it has no plans to release this evaluation to the public.
OpenAI's ChangeMyView benchmark is not new; it was also used to evaluate o1. Still, it highlights how valuable human data is to AI model developers, as well as the nebulous ways in which tech companies obtain datasets.
Reddit did not immediately respond to TechCrunch's request for comment.
Reddit has struck several AI licensing deals, but the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity had refused to negotiate with him, and said it had been "a real pain in the ass to block these companies."
OpenAI, in particular, has been accused in a number of lawsuits of inappropriately scraping websites, including The New York Times, to improve ChatGPT and its underlying AI models.
In terms of performance on the ChangeMyView benchmark, o3-mini does not appear to perform significantly better or worse than o1 or GPT-4o. However, OpenAI's latest AI models appear to be more persuasive than most people in the r/ChangeMyView subreddit.
"GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans," OpenAI says in o3-mini's system card. "Currently, we do not witness models performing far better than humans, or clear superhuman performance."
OpenAI's goal is not to create hyper-persuasive AI models, but to ensure that AI models do not become too persuasive. Because reasoning models have become quite good at persuasion and deception, OpenAI has developed new evaluations and safeguards to address the problem.
The fear motivating these persuasion tests is that an AI model could be dangerous if it is very good at persuading its human users. In theory, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.
The ChangeMyView benchmark shows that, even after scraping most of the public internet, jumping through hoops, and licensing other data, AI model developers are still struggling to find high-quality datasets for testing their models. Obtaining them is easier said than done.