[A version of this piece first appeared in TechCrunch’s robotics newsletter, Actuator. Subscribe here.]
Generative AI comes up frequently in my newsletter, Actuator. I'll admit that a few months ago I was a little hesitant to spend more time on the subject. Anyone who has been reporting on technology as long as I have has lived through countless hype cycles and been burned before. Covering tech requires a healthy dose of skepticism, ideally tempered with some genuine excitement about what's possible.
This time out, it seemed that generative AI was waiting in the wings, biding its time until the inevitable cratering of crypto. As that category bled out, projects like ChatGPT and DALL-E stood ready to soak up all the breathless coverage, hopefulness, criticism, doomerism and the various Kübler-Rossian stages of the tech hype bubble.
Those who follow my writing know I was never particularly bullish on crypto. Things are different with generative AI, however. For starters, there is broad, near-universal agreement that artificial intelligence/machine learning will play an outsized role in our lives going forward.
Smartphones offer great insight here. Computational photography is something I write about regularly, and there's been great progress on that front in recent years. Many manufacturers are finally striking a better balance between hardware and software, both improving the end product and lowering the barrier to entry. Google, for instance, pulls off some genuinely impressive tricks with editing features like Best Take and Magic Eraser.
Sure, they're neat tricks, but they're also useful, rather than features for features' sake. Going forward, though, the real trick will be seamlessly integrating them into the experience. In an ideal future workflow, most users will have little to no notion of what's happening behind the scenes. They'll just be happy that it works. That's the classic Apple playbook.
Generative AI delivers a similar "wow" effect right out of the gate, and that's where it differs from previous hype cycles. When your least tech-savvy relative can sit down at a computer, type a few words into a dialogue field and watch the black box spit out a picture or a short story, not much conceptualizing is required. That's a big part of why this all caught on so quickly. Usually when people tout cutting-edge technology, they have to ask you to imagine what it will look like in five or ten years.
With ChatGPT, DALL-E and the rest, you can try it out for yourself right now. The flip side, of course, is how difficult it becomes to temper expectations. Much as people tend to imbue robots with human or animal intelligence, without a foundational understanding of AI it's easy to project intentionality here. But that's how things go now. We lead with the attention-grabbing headline and hope people stick around long enough to read about the machinations behind it.
Spoiler alert: nine times out of ten they won't, and suddenly we find ourselves spending months or years trying to walk things back toward reality.
One of the great perks of my job is getting to talk through these problems with people much smarter than I am. They take the time to explain things, and hopefully I do a decent job translating that for readers (some attempts are more successful than others).
Once it became clear that generative AI has a key role to play in the future of robotics, I've been finding ways to work questions into conversations. It's fascinating how broadly people in the field agree with that premise, and how varied the impact is that they believe it will have.
For example, in a recent conversation with Max Bajracharya and Gill Pratt of the Toyota Research Institute, the latter explained the role generative AI is playing in their approach to robot learning:
We have figured out how to do something, which is to use modern generative AI techniques that enable human demonstration of both position and force to essentially teach a robot from just a handful of examples. The code is not changed at all. What this is based on is something called diffusion policy. It's work we did in collaboration with Columbia and MIT. We've taught 60 different skills so far.
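For readers curious what a "diffusion policy" approach looks like in shape, here is a minimal toy sketch of the general idea: actions are produced by iteratively denoising random noise, conditioned on an observation, using a model fit to a handful of demonstrations. Everything here (the demo data, the `toy_denoiser` stand-in, the annealing schedule) is hypothetical and illustrative, not TRI's actual code.

```python
# Toy sketch of diffusion-style action generation from demonstrations.
# The learned denoising network is replaced by a hand-written stand-in.
import numpy as np

rng = np.random.default_rng(0)

# A "handful of examples": observation -> short action trajectory (8 steps x 2 DOF).
demos = {
    "obs_a": rng.normal(0.0, 0.1, size=(8, 2)) + np.array([0.5, -0.2]),
    "obs_b": rng.normal(0.0, 0.1, size=(8, 2)) + np.array([-0.3, 0.4]),
}

def toy_denoiser(noisy_actions, obs_key, noise_level):
    """Stand-in for a trained network: nudge the noisy trajectory toward the
    demonstrated trajectory for this observation, proportional to noise level."""
    target = demos[obs_key]
    return noisy_actions + noise_level * (target - noisy_actions)

def sample_actions(obs_key, steps=10):
    """Reverse-diffusion-style sampling: start from pure noise, denoise iteratively."""
    actions = rng.normal(size=(8, 2))   # start from Gaussian noise
    for t in range(steps, 0, -1):
        noise_level = t / steps         # anneal from 1.0 down toward 0
        actions = toy_denoiser(actions, obs_key, noise_level)
    return actions

print(sample_actions("obs_a")[:2])      # first two denoised action steps
```

The appeal of the real technique is in that conditioning step: the same sampling loop, with a network trained on human demonstrations of position and force, can cover many skills without changing the code.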
Last week, I asked Deepu Talla, vice president and general manager of embedded and edge computing at Nvidia, why he believes generative AI is more than just a fad. He answered:
I think it speaks in the results. You can already see the productivity improvement. It can compose an email for me. It's not exactly right, but I don't have to start from zero. It's giving me 70%. You can already see signs that this is definitely a step function better than how things were before. Summarizing something is not perfect; I'm not going to let it read and summarize for me. So you can already see some signs of productivity improvements.
Meanwhile, in my last conversation with Daniela Rus, the MIT CSAIL director explained how researchers are using generative AI to actually design the robots themselves:
It turns out that generative AI can be quite powerful for solving even motion planning problems. It gets you much faster solutions, and much more fluid and human-like solutions for control, than model-predictive solutions do. I think that's very powerful, because the robots of the future will be much less robotic. They will be much more fluid and human-like in their motions.
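To make the contrast with model-predictive control concrete, here is a minimal sketch of the sample-and-select flavor of generative planning: instead of solving an optimization online, draw candidate trajectories from a learned model and keep the cheapest. The "generator" below is a toy Gaussian sampler standing in for a trained network; the start, goal and cost terms are all illustrative assumptions.

```python
# Toy sketch: generative motion planning by sampling trajectories and ranking them.
import numpy as np

rng = np.random.default_rng(7)
start, goal = np.array([0.0, 0.0]), np.array([1.0, 1.0])

def sample_trajectory(steps=10):
    """Stand-in generator: a straight line from start to goal plus smooth noise."""
    line = np.linspace(start, goal, steps)
    return line + rng.normal(0.0, 0.05, size=line.shape)

def cost(traj):
    """Prefer short paths that actually end near the goal."""
    path_length = np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()
    return path_length + 10.0 * np.linalg.norm(traj[-1] - goal)

best = min((sample_trajectory() for _ in range(100)), key=cost)
print(f"best path cost: {cost(best):.3f}")
```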
We've also used generative AI for design. This is very powerful. It's also very interesting, because it's not just pattern generation for robots. You have to do something else; it can't just be generating a pattern based on data. The machines have to make sense in the context of physics and the physical world. For that reason, we connect them to a physics-based simulation engine to make sure the designs meet their required constraints.
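The pipeline Rus describes can be sketched as a generate-then-verify loop: a generative model proposes candidate designs, and a physics-based check rejects any that violate physical constraints. In this toy version both the generator and the "simulator" are stand-ins (a random sampler and a one-line torque estimate), not any lab's actual tooling.

```python
# Toy sketch: generative design filtered through a physics-based feasibility check.
import random

random.seed(1)

def generate_design():
    """Stand-in for a generative model: sample arm link lengths and a motor."""
    return {
        "link_lengths": [random.uniform(0.1, 0.5) for _ in range(3)],  # meters
        "motor_torque": random.uniform(1.0, 10.0),                     # N*m
    }

def simulate(design):
    """Stand-in for a physics engine: worst-case torque at the base joint."""
    reach = sum(design["link_lengths"])   # fully extended arm, m
    payload_weight = 2.0 * 9.81           # 2 kg payload, N
    return reach * payload_weight         # required torque, N*m

def feasible(design):
    """Constraint check: the motor must cover the simulated torque demand."""
    return design["motor_torque"] >= simulate(design)

candidates = [generate_design() for _ in range(100)]
valid = [d for d in candidates if feasible(d)]
print(f"{len(valid)} of {len(candidates)} generated designs pass the physics check")
```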
This week, a team at Northwestern University showcased new research of its own into AI-generated robot design. The researchers demonstrated how they designed a successfully walking robot in mere seconds. It's not much to look at, as these things go, but it's easy enough to see how, with additional research, the approach could be used to create more complex systems.
"We discovered a very fast AI-driven design algorithm that bypasses the traffic jams of evolution, without falling back on the bias of human designers," said research lead Sam Kriegman. "We told the AI that we wanted a robot that could walk across land. Then we simply pressed a button, and in the blink of an eye it generated a blueprint for a robot that looks nothing like any animal that has ever walked the earth. I call this process 'instant evolution.'"
It was the AI program's decision to put legs on the small, squishy robot. "It's interesting because we didn't tell the AI that the robot needed legs," Kriegman added. "It rediscovered that legs are a good way to move around on land. Legged locomotion is, in fact, the most efficient form of terrestrial movement."
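For a rough feel of what "instant evolution" style design search looks like, here is a hedged toy loop: repeatedly mutate a voxel body plan and keep changes that improve a simulated walking score. The Northwestern work used gradient-based optimization rather than this naive hill-climbing, and the scoring function below is an invented stand-in for a physics rollout.

```python
# Toy sketch: automated body-plan search for a walking robot.
import random

random.seed(42)

GRID = 5  # 5x5 voxel body plan: 1 = material, 0 = empty

def walking_score(body):
    """Stand-in for a physics rollout: crudely reward leg-like contact voxels on
    the bottom row while penalizing wasted material elsewhere."""
    bottom_contacts = sum(body[GRID - 1])
    mass = sum(sum(row) for row in body)
    return bottom_contacts * 3 - mass

def mutate(body):
    """Flip one random voxel."""
    new = [row[:] for row in body]
    r, c = random.randrange(GRID), random.randrange(GRID)
    new[r][c] ^= 1
    return new

body = [[1] * GRID for _ in range(GRID)]   # start from a solid block
for _ in range(200):                        # "mere seconds" of search
    candidate = mutate(body)
    if walking_score(candidate) >= walking_score(body):
        body = candidate

for row in body:
    print("".join("#" if v else "." for v in row))
```

Even this crude version illustrates the headline point: nobody tells the search that ground contact matters, yet the surviving designs concentrate material where it touches the ground.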
"From my perspective, generative AI and physical automation/robotics are going to change everything we know about life on Earth," Formant founder and CEO Jeff Linnell told me this week. "I think we're all hip to the fact that AI is a thing and are anticipating that every one of our jobs, every company and every student will be impacted. You're not going to program a robot. You're going to speak to it in English, request an action, and then it will figure it out. It's going to be a minute for that."
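The interaction Linnell describes, language in, action out, can be sketched in a few lines. Here a trivial keyword matcher stands in for the language model that would do the interpretation in practice, and the action primitives are entirely hypothetical names, not any vendor's API.

```python
# Toy sketch: mapping an English request onto (hypothetical) robot primitives.
ACTION_PRIMITIVES = {
    "pick": "arm.grasp(target)",
    "place": "arm.release_at(target)",
    "move": "base.navigate_to(target)",
}

def interpret(request: str) -> list[str]:
    """Turn an English request into a sequence of primitive calls.
    A real system would use an LLM here instead of keyword matching."""
    plan = [prim for verb, prim in ACTION_PRIMITIVES.items()
            if verb in request.lower()]
    return plan or ["ask_for_clarification()"]

print(interpret("Please pick up the cup and move it to the kitchen"))
# -> ['arm.grasp(target)', 'base.navigate_to(target)']
```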
Prior to joining Formant, Linnell founded and served as CEO of Bot & Dolly. The San Francisco-based firm was best known for its work on the film Gravity before being acquired by Google in 2013, as the software giant set its sights on accelerating the industry. The executive tells me his key takeaway from that experience is that it's all about the software (I suspect Google agrees, given the manner in which Intrinsic and Everyday Robots have been absorbed into DeepMind).