In a new paper, three Google DeepMind researchers have discovered something about AI models that could throw a wrench in their employer’s plans for more advanced AI.
Written by DeepMind researchers Steve Yadlowsky, Lyric Doshi, and Nilesh Tripuraneni, the not-yet-peer-reviewed paper elaborates on something many of us have observed in recent months: that today’s AI models aren’t very good at producing output that goes beyond their training data.
The paper focuses on OpenAI’s GPT-2 (which is, admittedly, two versions behind the current one) and so-called transformer models. A transformer, as the name suggests, is an AI model that transforms one type of input into another type of output.
The “T” in OpenAI’s GPT architecture stands for “transformer,” and this type of model was first theorized in the 2017 paper “Attention Is All You Need” by another group of Google researchers. Transformers are believed by some to have the potential to lead to artificial general intelligence (AGI), or human-level AI; the reasoning goes that this is the kind of system that would let machines do the same sort of intuitive “thinking” we do.
Hopes for transformers are high, and an AI model that could leap beyond its training data would indeed be amazing. But at least when it comes to GPT-2, they still leave a lot to be desired.
“We demonstrate different failure modes and poor generalization of transformers, even for simple extrapolation tasks, when presented with tasks or functions that lie outside the domain of the pre-training data,” Yadlowsky, Doshi, and Tripuraneni write.
In other words, if a transformer model hasn’t been trained on data related to what’s being asked of it, it probably won’t be able to perform the task, even a simple one.
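To make the idea concrete, here’s a toy sketch, not the researchers’ actual experimental setup: any flexible model fit only on a narrow slice of inputs can look great in-distribution and fall apart the moment it’s asked to extrapolate. Every detail here, from the sine curve to the degree-5 polynomial, is an illustrative assumption, not something taken from the paper.

```python
# Toy illustration (NOT the DeepMind paper's experiment): a flexible
# model fit on a narrow input range can extrapolate badly outside it.
import numpy as np

rng = np.random.default_rng(0)

# Training data: y = sin(x), but sampled only on the interval [0, 3].
x_train = rng.uniform(0, 3, 200)
y_train = np.sin(x_train)

# A degree-5 polynomial stands in for "a flexible learner."
coeffs = np.polyfit(x_train, y_train, deg=5)

# In-distribution: the fit is nearly perfect.
print(np.polyval(coeffs, 1.5), "vs true", np.sin(1.5))

# Out-of-distribution: the prediction diverges wildly from sin(8.0).
print(np.polyval(coeffs, 8.0), "vs true", np.sin(8.0))
```

The paper’s claim is that transformers, for all their scale, show a version of the same brittleness: strong inside the pre-training distribution, shaky outside it.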
That said, given the circumstances, it’s not unreasonable that people expected otherwise. The seemingly enormous training datasets used to build large language models (LLMs) like OpenAI’s GPT are certainly impressive. Like the kids parents send to the most expensive, top-rated preschools, these models are stuffed with so much knowledge that there isn’t much they haven’t been trained on.
Of course, there are some caveats. GPT-2 is ancient history at this point, and newer AI models may have emergent properties: given enough training data, they might start connecting that information to the world beyond it. Or perhaps clever researchers will devise new approaches that transcend the limitations of the current paradigm.
Still, the gist of the findings is sobering for even the most enthusiastic AI hype men. At its core, the paper seems to argue that today’s best approaches are nimble only on topics they’ve been thoroughly trained on, meaning that, for now at least, AI is only as good as the human expertise embedded in the data used to train it.
Since ChatGPT, built on the GPT framework, was released last year, pragmatists have advised people to temper their expectations for AI and to hold off on AGI predictions. But caution is far less appealing than the dollar signs CEOs are seeing or the fortune tellers proclaiming AI’s uncanny powers. Along the way, even some of the most erudite researchers seem to have developed outsized ideas about how smart the current best LLMs actually are, believing that AI is already becoming capable of the leaps of thought that separate humans from machines.
These warnings, now backed up by research, seem to be falling on deaf ears with OpenAI CEO Sam Altman and Microsoft CEO Satya Nadella, who just this week touted to investors that they would “build AGI together.”
To be sure, Google DeepMind is not exempt from this kind of prophecy either.
In a podcast interview last month, DeepMind co-founder Shane Legg said he believes there’s a 50 percent chance of achieving AGI by 2028, a belief he says he’s held for more than a decade.
“I don’t think there’s one single thing that would do it, because I think that’s the nature of it; it’s about general intelligence,” Legg told tech podcaster Dwarkesh Patel. “I’d have to make sure [an AI system] could do lots and lots of different things and it didn’t have a gap.”
But given that three DeepMind employees have found that transformer models don’t seem capable of doing much beyond what they were trained on, that coin flip might not land in their bosses’ favor.
More on AGI: OpenAI’s top researcher is worried AGI will treat us like animals