Summary: Researchers have developed a technique that gives neural networks the ability to make compositional generalizations, similar to how humans understand and extend new concepts.
This new technique, named meta-learning for compositionality (MLC), challenges decades-old skepticism about the power of artificial neural networks. MLC involves training the network through episodic learning to enhance generalization skills.
Remarkably, MLC matched or exceeded human performance on a variety of tasks.
Key facts:
- MLC centers on episodic training of neural networks, allowing them to better generalize to new concepts in a compositional manner.
- On tasks involving novel word combinations, MLC performed as well as or better than human participants.
- Popular models such as ChatGPT and GPT-4, despite their advances, struggle with this kind of compositional generalization, but MLC could be a way to enhance their capabilities.
Source: New York University
Humans have the ability to learn a new concept and quickly use it to understand related uses of that concept. Once children know how to “skip,” they understand what it means to “skip around the room twice” or “skip with your hands up.”
But can machines do this kind of thinking? In the late 1980s, the philosophers and cognitive scientists Jerry Fodor and Zenon Pylyshyn argued that artificial neural networks, the engines that drive artificial intelligence and machine learning, are not capable of making these connections, known as “compositional generalizations.”
In the decades since, however, scientists have developed ways to instill this ability in neural networks and related technologies, with mixed success, keeping this decades-old debate alive.
Researchers from New York University and Spain’s Pompeu Fabra University now report, in the journal Nature, a technique that improves the ability of tools such as ChatGPT to make compositional generalizations.
This technique, Meta-Learning for Compositionality (MLC), outperforms existing approaches and is comparable to, and in some cases better than, human performance.
MLC focuses on training neural networks, the engines behind ChatGPT and related technologies for speech recognition and natural language processing, to become better at compositional generalization through practice.
Developers of existing systems, including large language models, have either hoped that compositional generalization would emerge from standard training methods or have built special-purpose architectures to achieve it. MLC, the authors point out, shows that explicitly practicing these skills allows such systems to unlock new powers.
“For 35 years, researchers in cognitive science, artificial intelligence, linguistics, and philosophy have debated whether neural networks can achieve human-like systematic generalization,” says Brenden Lake, an assistant professor in New York University’s Center for Data Science and Department of Psychology and one of the authors of the paper.
“We have shown for the first time that a general neural network can imitate or even exceed human systematic generalization in direct comparison.”
In exploring the possibility of strengthening compositional learning in neural networks, the researchers created MLC, a novel learning procedure in which a neural network is continuously updated, improving its skills over a series of episodes.
In each episode, MLC receives a new word and is asked to use it compositionally. For instance, it might take the word “jump” and be asked to create new word combinations, such as “jump twice” or “jump right twice.” MLC then receives a new episode featuring a different word, improving the network’s compositional skills each time.
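The episodic setup described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the authors’ actual training code: the pseudo-words, actions, and modifiers are invented for the example, and a real MLC pipeline would feed such episodes to a sequence-to-sequence network.

```python
import random

# Hypothetical sketch of one MLC-style training episode (not the paper's code).
# The word-to-action mapping is freshly randomized in every episode, so a
# learner must infer word meanings from the study examples rather than
# memorize fixed pairs, and must apply modifiers like "twice" systematically.

ACTIONS = ["JUMP", "SKIP", "WALK", "LOOK"]
PSEUDO_WORDS = ["dax", "zup", "wif", "lug"]   # nonsense terms, as in the study
MODIFIERS = {"twice": 2, "thrice": 3}         # invented compositional modifiers

def make_episode(rng):
    """Build one episode: a fresh word->action mapping, study examples
    showing each primitive in isolation, and compositional queries."""
    words = rng.sample(PSEUDO_WORDS, k=len(ACTIONS))
    mapping = dict(zip(words, ACTIONS))

    # Study examples: each pseudo-word shown with its (episode-specific) meaning.
    study = [(w, [a]) for w, a in mapping.items()]

    # Query examples: the learner must combine a primitive with a modifier.
    queries = []
    for w, a in mapping.items():
        mod = rng.choice(list(MODIFIERS))
        queries.append((f"{w} {mod}", [a] * MODIFIERS[mod]))
    return {"study": study, "query": queries}

rng = random.Random(0)
episode = make_episode(rng)
```

Because the mapping changes every episode, the only strategy that works across a long stream of such episodes is the compositional one: learn the primitives from the study examples, then apply the modifiers algebraically.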
To test the effectiveness of MLC, Lake, co-director of New York University’s Minds, Brains, and Machines Initiative, and Marco Baroni, a researcher at the Catalan Institute for Research and Advanced Studies and professor in the Department of Translation and Language Sciences at Pompeu Fabra University, conducted a series of experiments with human participants using the same tasks performed by MLC.
Moreover, rather than learning the meanings of actual words that humans would already know, participants also had to learn the meanings of nonsense terms defined by the researchers (such as “zup” and “dax”) and figure out how to apply them in different ways.
MLC performed comparably to, and in some cases better than, the human participants. Both MLC and people also outperformed ChatGPT and GPT-4, which, despite their striking general abilities, struggled with this learning task.
“Large language models such as ChatGPT have improved considerably in recent years, but still struggle with compositional generalization,” observes Baroni, a member of Universitat Pompeu Fabra’s Computational Linguistics and Language Theory research group.
“However, we believe that MLC can further improve the compositional skills of large language models.”
About this artificial intelligence research news
Author: James Devitt
Source: New York University
Contact: James Devitt – New York University
Image: The image is credited to Neuroscience News
Original Research: Open access.
“Human-like systematic generalization through a meta-learning neural network” by Brenden Lake et al. Nature
Abstract
Human-like systematic generalization through a meta-learning neural network
The power of human language and thought derives from systematic compositionality, the algebraic ability to understand and produce novel combinations from known components.
Fodor and Pylyshyn famously argued that artificial neural networks lack this ability and are therefore not viable models of the mind. Since then, neural networks have come a long way, but the systematicity challenge persists.
Here we successfully address Fodor and Pylyshyn’s challenge by providing evidence that neural networks can achieve human-like systematicity when optimized for their compositional skills.
To do this, we introduce a Meta-Learning for Compositionality (MLC) approach that guides training through a dynamic stream of compositional tasks. To compare humans and machines, we conducted human behavioral experiments using an instruction learning paradigm.
After considering seven different models, we found that, in contrast to perfectly systematic but rigid probabilistic symbolic models and perfectly flexible but unsystematic neural networks, only MLC achieves both the systematicity and flexibility needed for human-like generalization. MLC also advances the compositional skills of machine learning systems on several systematic generalization benchmarks.
Our results demonstrate how standard neural network architectures optimized for compositional skills can mimic systematic human generalization in direct comparisons.