Mon. Dec 23rd, 2024
OpenAI Gives ChatGPT a Voice for Verbal Conversations

OpenAI's ChatGPT is evolving into much more than a text-based search engine: the company announced today that new voice and image-based smart features are being added to the mix.

The hugely popular generative AI assistant has been one of the biggest technology success stories of recent years. Since its debut about nine months ago, it has let anyone generate essays, poems, and summaries from simple text-based prompts. Now ChatGPT is becoming more interactive still, allowing users to hold voice conversations with the chatbot.

The announcement came on the same day Amazon pledged to invest up to $4 billion in OpenAI rival Anthropic. The move forms part of a larger generative AI battle between the world’s tech giants, including Google, which is trying to catch up through its Bard chatbot; Meta, whose strong open source ethos gives it an edge; and Microsoft, which is working closely with OpenAI itself.

Conversation starter

Now, OpenAI is merging the familiar world of voice-based assistants with its powerful Large Language Models (LLMs), marking a remarkable evolution of the generative AI movement.

For example, a user can verbally ask ChatGPT to create a bedtime story on the fly, with a few spoken prompts guiding the story. Alternatively, users can simply ask a question and have ChatGPT respond in audio format.

Elsewhere, ChatGPT users can also search for answers using images. For example, you can upload a photo of something and have ChatGPT explain what it is or provide steps to achieve your goal.

ChatGPT image search. Image credits: OpenAI

The voice feature is powered by a new text-to-speech model that can generate human-like voices from text and a few seconds of sampled audio. OpenAI said it uses its open source Whisper speech recognition system to transcribe spoken words into text, and that it worked with professional voice actors to create five different voices.

Spotify was also announced as a launch partner. The music streaming giant is introducing a new feature for podcasters that lets them sample their own voice and translate their shows from English into Spanish, French, or German while preserving their original voice. However, OpenAI appears wary of inviting criticism and is not making this technology available to everyone. For this launch, Spotify is working with a small group of podcasters, including Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.

“New audio technology that can create realistic synthetic speech from just a few seconds of real audio opens the door to many creative and accessibility-focused applications,” the company said in a blog post. “However, these capabilities also introduce new risks, such as the potential for malicious actors to impersonate celebrities and commit fraud.”

The new features will begin rolling out to paid Plus and Enterprise subscribers within the next two weeks. To enable voice functionality, users must go to the app’s “Settings” menu, then “New Features”, and opt in to voice conversations. They then tap the headphone button in the top right corner and select the voice they want.

Voice will initially be limited to the ChatGPT Android and iOS apps on an opt-in beta basis, while image search will be brought to all platforms by default.