MIT's Machine-Learning System Based on Light Could Yield More Powerful ChatGPT-Like Programs

An artist’s rendition of a light-based computer system that could potentially boost the power of machine-learning programs like ChatGPT. The blue area represents the micron-scale lasers that are key to the technology. Credit: Ella Mar Studio

The MIT-led team demonstrated a system that is more than 100 times more energy efficient and 25 times more computationally dense than today’s state-of-the-art digital systems for machine learning.

ChatGPT has become a hot topic around the world for its ability to create essays, emails, and computer code based on a few prompts from users. Now, an MIT-led team has reported a system that could lead to machine learning programs that are orders of magnitude more powerful than the program behind ChatGPT. The system they developed has the potential to use orders of magnitude less energy than the most advanced supercomputers behind today’s machine learning models.

In a recent issue of Nature Photonics, the researchers report the first experimental demonstration of a new system that uses lasers hundreds of microns in scale to perform computations based on the movement of light rather than electrons. They report that the new system improves energy efficiency by more than 100 times, and computational density, a measure of a system’s power, by 25 times, compared with the state-of-the-art digital computers used for machine learning.

Towards the Future

In their paper, the research team also notes that orders-of-magnitude further improvements are possible in the near future. As a result, the authors continue, the technology “paves the way for large-scale optoelectronic processors to accelerate machine learning tasks from data centers to distributed edge devices.” In other words, cellphones and other small devices could become capable of running programs that can currently be computed only in large data centers.

Additionally, because the system’s components can be made using manufacturing processes already in use today, “we expect that it could be scaled for commercial use within a few years. For example, the laser arrays involved are widely used in facial identification and data communication,” said lead author Zaijun Chen, who conducted the research as a postdoc in MIT’s Research Laboratory of Electronics (RLE) and is now an assistant professor at the University of Southern California.

“It is not economically viable to train much larger models,” said Dirk Englund, an associate professor in MIT’s Department of Electrical Engineering and Computer Science and leader of the study. “Our new technology may make it possible to leap to machine-learning models that would otherwise be unreachable in the near future.”

He continued, “We don’t know what capabilities the next generation of ChatGPT will have if it is 100 times more powerful, but that’s the regime of discovery that this kind of technology can enable.” Englund also leads MIT’s Quantum Photonics Laboratory and is affiliated with RLE and the Materials Research Laboratory.

The Heartbeat of Progress

The current study is the latest in a series of advances made by Englund and many of the same colleagues over the past few years. For example, in 2019 the team reported the theoretical work that led to the current demonstration. Ryan Hamerly, the first author of that earlier paper, now of RLE and NTT Research Inc., is also an author of the current paper.

Additional co-authors of the current Nature Photonics paper are Alexander Sludds, Ronald Davis, Ian Christen, Liane Bernstein, and Lamia Ateshian, all of RLE, and Tobias Heuser, Niels Heermeier, James A. Lott, and Stephan Reitzenstein of Technische Universität Berlin.

Deep neural networks (DNNs), like the one behind ChatGPT, are based on huge machine-learning models that simulate how the brain processes information. But even as the field grows, the digital technology behind today’s DNNs is reaching its limits. DNNs also require huge amounts of energy and are largely confined to large data centers. This motivates the development of new computing paradigms.

Optical Neural Networks and Their Potential

Performing DNN calculations using light rather than electrons could potentially break through current bottlenecks. For example, calculations using optics can require much less energy than calculations based on electronics. Additionally, using optics, Chen says, “we can achieve much greater bandwidth,” meaning greater computational density. Light can transfer more information in a much smaller area.

However, current optical neural networks (ONNs) face significant challenges. For example, converting incoming electrical data into light is inefficient and consumes a large amount of energy. The components involved are also bulky and take up considerable space. And while ONNs are good at linear calculations such as addition, they are poor at nonlinear calculations such as multiplication and “if” statements.
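The split described here — optics handling the linear part of a neural-network layer well, but struggling with the nonlinear part — can be seen in a minimal sketch. This is purely illustrative plain Python, not code from the paper’s system: a layer decomposes into a matrix-vector product (pure multiply-accumulate, the workload optics excels at) followed by an elementwise nonlinearity.

```python
def linear(W, x):
    """Matrix-vector product: the 'optics-friendly' bulk of the work.
    Every output is a sum of products, i.e. pure multiply-accumulate."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def relu(v):
    """Elementwise nonlinearity: trivial digitally, historically hard
    to implement efficiently in an optical system."""
    return [max(0.0, a) for a in v]

def layer(W, x):
    """One neural-network layer = linear part + nonlinear part."""
    return relu(linear(W, x))

# Toy example: 3 inputs -> 2 outputs.
W = [[0.5, -1.0, 2.0],
     [1.5,  0.0, -0.5]]
x = [1.0, 2.0, 3.0]
print(layer(W, x))  # -> [4.5, 0.0]
```

Note that the linear step performs six multiplications here, while the nonlinearity is just two `max()` operations — in large models the linear part dominates the arithmetic, which is why offloading it to optics is attractive.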

In the current study, the researchers introduce a compact architecture that, for the first time, solves these three challenges simultaneously. The architecture is based on arrays of state-of-the-art vertical-cavity surface-emitting lasers (VCSELs), a relatively new technology used in applications such as lidar remote sensing and laser printing. The specific VCSELs reported in the Nature Photonics paper were developed by the Reitzenstein group at Technische Universität Berlin. “This was a collaborative project that would not have been possible without them,” Hamerly says.

Logan Wright, an assistant professor at Yale University who was not involved in the current study, commented: “It is an inspiration and encouragement to me, and probably to many other researchers in this area, that systems based on modulated VCSEL arrays could be a viable route to large-scale, high-speed optical neural networks. Of course, the state of the art here is still far from the scale and cost that would be necessary for practical devices, but I am optimistic about what can be realized in the coming years, especially given the potential these systems have to accelerate the very large, very expensive AI systems used in popular textual ‘GPT’ systems like ChatGPT.”

Reference: “Deep Learning with Coherent VCSEL Neural Networks” by Zaijun Chen, Alexander Sludds, Ronald Davis III, Ian Christen, Liane Bernstein, Lamia Ateshian, Tobias Heuser, Niels Heermeier, James A. Lott, Stephan Reitzenstein, Ryan Hamerly and Dirk Englund, 17 July 2023, Nature Photonics.
DOI: 10.1038/s41566-023-01233-w

Chen, Hamerly, and Englund have filed a patent application for this research, which was supported by the U.S. Army Research Office, NTT Research, the U.S. National Defense Science and Engineering Graduate Fellowship Program, the U.S. National Science Foundation, the Natural Sciences and Engineering Research Council of Canada, and the Volkswagen Foundation.