The “T” at the end of ChatGPT can be credited to Aidan Gomez. He was part of the group of Google engineers who first introduced a new artificial intelligence architecture called the transformer.
This helped lay the foundation for today’s generative AI boom, which ChatGPT creator OpenAI and others have built on top of. Gomez, one of eight co-authors on Google’s 2017 paper, was a 20-year-old intern at the time.
He is now the CEO and co-founder of Cohere, a Toronto-based startup that competes with larger AI companies by providing large language models and chatbots to enterprises and other organizations.
Gomez spoke with The Associated Press about the future of generative AI. The interview has been edited for length and clarity.
Q: What is a transformer?
A: A transformer is a neural network architecture, the structure of the computations that happen inside the model. What makes transformers special compared with competing architectures and other ways of building neural networks is essentially that they are extremely scalable. You can train them across thousands, even tens of thousands, of chips, and they can be trained very quickly. They use operations that GPUs (graphics chips) have heavily optimized. Compared with what existed before transformers, their processing is faster and more efficient.
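The interview stays high-level, but the computational core of the architecture Gomez is describing is the attention operation, a few large matrix multiplications. A minimal sketch in NumPy (toy shapes, a single head, no learned weights; real transformers stack many such layers):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """The core transformer operation: every position attends to every other.

    Q, K, V: (seq_len, d) arrays of queries, keys and values.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V  # each output is a weighted mix of all values

# Toy example: 4 tokens with 8-dimensional embeddings, self-attention (Q = K = V).
x = np.random.default_rng(0).standard_normal((4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)
```

Because everything here is dense matrix math with no sequential recurrence, the work parallelizes across many chips, which is the scalability property Gomez highlights.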
Q: How important are they to what we do at Cohere?
A: Very important. We, like everyone else, use the transformer architecture when building large language models. At Cohere, we focus heavily on scalability and production readiness for enterprises. Some of the models we compete with are huge and highly inefficient. You can’t actually deploy them into production, because as soon as you put them in front of real users, costs mount quickly and the economics collapse.
Q: What are some specific examples of how customers are using Cohere models?
A: My favorite example comes from health care. It starts from the surprising fact that doctors spend about 40% of their working day writing patient notes. So what if a doctor carried a small passive listening device that accompanied them through the day, listened to their conversations with patients, and pre-drafted those notes? Then they don’t have to write the notes from scratch. The first draft is already there; all they have to do is read it and edit it. Suddenly, doctors become much more productive.
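Gomez is describing a workflow, not an implementation, but the shape of such a system is easy to sketch. In the snippet below, `llm_complete` is a hypothetical stand-in for whatever chat-style model API you use (it is not a Cohere function), and the prompt template is invented for illustration:

```python
# Hypothetical sketch of the note-drafting workflow described above.
# `llm_complete` stands in for any LLM completion or chat API.

def llm_complete(prompt: str) -> str:
    raise NotImplementedError("wire up your model provider of choice here")

PROMPT = """You are drafting a clinical visit note for a physician to review.

Transcript of the visit:
{transcript}

Write a concise first-draft note, and flag anything uncertain for the
physician to verify before signing off."""

def draft_note(transcript: str) -> str:
    """Turn a recorded visit transcript into an editable first draft."""
    return llm_complete(PROMPT.format(transcript=transcript))
```

The design point is the one Gomez makes: the model only produces a first draft, and the doctor stays in the loop as the editor.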
Q: How do you address customer concerns that AI language models are prone to hallucinations (errors) and bias?
A: Customers are always worried about hallucinations and bias. They lead to a bad product experience, so we take them seriously. On hallucinations, we focus on RAG, retrieval-augmented generation. We have released a new model called Command R that explicitly targets RAG. It lets you connect the model to private sources of trusted knowledge, whether that’s an organization’s internal documents or a particular employee’s email. You are giving the model access to information that wasn’t publicly available on the web when the model was trained. Crucially, you can also fact-check the model, because instead of just taking text in and pushing text out, it actually references the documents and can cite where it got the information. Seeing that in action gives people much more confidence in the tool, and it dramatically reduces hallucinations.
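A toy sketch of the RAG pattern Gomez describes: retrieve trusted documents, hand them to the model alongside the question, and require the answer to cite its sources. Both the keyword-overlap retriever and the `generate` function below are illustrative stand-ins (a real system would use embeddings or a search index, and a real model API), not Cohere’s implementation:

```python
# Toy retrieval-augmented generation: ground the model in trusted documents
# and make it cite them, so answers can be fact-checked against the sources.

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call."""
    raise NotImplementedError

def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Rank document ids by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc_id: len(q_words & set(documents[doc_id].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def answer_with_citations(query: str, documents: dict[str, str]) -> str:
    context = "\n".join(f"[{d}] {documents[d]}" for d in retrieve(query, documents))
    prompt = (
        "Answer using ONLY the sources below, and cite their ids in brackets.\n"
        f"{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```

The fact-checking property falls out of the structure: because the answer names the ids of the passages it drew on, a reader can trace each claim back to the underlying document.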
Q: What is the public’s biggest misconception about generative AI?
A: The fear that some individuals and organizations have that this technology is going to become a Terminator, an existential risk. These are stories humans have been telling themselves for decades: technology arrives, overtakes us, displaces us, makes us subordinate. They are deeply embedded in the popular cultural subconscious, which makes it a very compelling narrative; it captures people’s imaginations and fears easily, and it gets a lot of attention because it’s so appealing as a story. But the reality is, I think this technology is going to be overwhelmingly positive. There’s a lot of discussion about how badly things could go, but those of us who are building the technology are well aware of those risks and are working to mitigate them. We all want this to work. We all want the technology to be an addition to humanity, not a threat to it.
Q: OpenAI and many other big technology companies now say they are working to build artificial general intelligence (a broad term for AI that surpasses human abilities). Is AGI part of your mission?
A: No, I don’t think it is. For me, AGI is not the end goal. The goal is for this technology to have an enormously positive impact on the world. It’s a very general technology; it’s reasoning, it’s intelligence, so it applies everywhere. We want to make sure it’s as beneficial as possible, as early as possible, not some quasi-religious pursuit of AGI whose definition we don’t even quite understand.
Q: What happens next?
A: I think everyone should be watching tool use, models acting more like agents. These are models you can hand a tool you’ve just built, maybe a software program or an API (application programming interface), and say: “Hey model, I just made this. Here’s what it does. Here’s how to work with it. It’s part of the toolkit of things you can do.” The general principle of giving a model a tool it has never seen before and having it employ that tool effectively is going to be very powerful. To do a lot of things, you need access to external tools. Right now, the model can only write text back to you. When you give models access to tools, they can actually take action in the real world on your behalf.
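A toy sketch of the loop Gomez describes: describe a freshly built tool to the model in plain text, let it reply with a structured call, and execute that call. The `llm` function is a hypothetical stand-in (here it returns a canned reply so the sketch runs end to end), and the JSON calling convention is invented for illustration:

```python
import json

def get_weather(city: str) -> str:
    """The freshly built 'tool' -- any function or API you want the model to use."""
    return f"Sunny, 21C in {city}"  # canned result for the sketch

TOOLS = {"get_weather": get_weather}

TOOL_SPEC = """You may call a tool by replying with JSON:
{"tool": "get_weather", "arguments": {"city": "<name>"}}
get_weather(city): returns the current weather for a city."""

def llm(prompt: str) -> str:
    """Hypothetical model call; returns a canned tool invocation here."""
    return '{"tool": "get_weather", "arguments": {"city": "Toronto"}}'

def run_agent_step(user_request: str) -> str:
    reply = llm(f"{TOOL_SPEC}\n\nUser: {user_request}")
    call = json.loads(reply)                         # the model chose a tool call
    return TOOLS[call["tool"]](**call["arguments"])  # actually execute it

print(run_agent_step("What's the weather in Toronto?"))  # Sunny, 21C in Toronto
```

The step that turns text generation into action is the last line of `run_agent_step`: the model’s output is executed rather than merely displayed back to the user.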
Copyright © 2024 Associated Press. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.