One of the more unexpected products to come out of Microsoft’s Ignite 2023 event is a tool that can create a photorealistic avatar of a person and animate that avatar saying things the person didn’t necessarily say.
The new feature, called Azure AI Speech Text-to-Speech Avatar, is currently available in public preview and allows users to upload an image of the person they want their avatar to resemble and write a script to generate a video of their avatar speaking. Microsoft’s tool trains the model that drives the animation, while a separate text-to-speech model (either pre-built or trained with a human voice) “reads” the script aloud.
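The workflow described above boils down to submitting a script plus avatar and voice settings to a synthesis service. The sketch below assembles such a request body in Python; the field names (`synthesisConfig`, `talkingAvatarCharacter`, etc.) are illustrative assumptions about the preview's schema, not a confirmed API contract, so consult the Azure AI Speech documentation before relying on them.

```python
import json

def build_avatar_request(script_text: str, voice: str, avatar: str) -> str:
    """Assemble a hypothetical avatar batch-synthesis request body.

    Field names below are illustrative placeholders, not verified
    against the actual Azure AI Speech preview API.
    """
    payload = {
        "synthesisConfig": {"voice": voice},       # prebuilt or custom neural voice
        "inputs": [{"text": script_text}],         # the script the avatar will speak
        "properties": {
            "talkingAvatarCharacter": avatar,      # which avatar to animate
            "videoFormat": "mp4",
        },
    }
    return json.dumps(payload)

body = build_avatar_request(
    "Welcome to our product tour.", "en-US-JennyNeural", "lisa"
)
```

In practice this body would be POSTed to the service's batch-synthesis endpoint with an Azure subscription key, and the service would return a job whose output is the rendered avatar video.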
“Text-to-speech avatars allow users to more efficiently create videos for training, product introductions, customer testimonials [and so on] simply from text input,” Microsoft writes in a blog post. “You can use avatars to build conversational agents, virtual assistants, chatbots, and more.”
Avatars can speak in multiple languages, and in chatbot scenarios they can tap AI models such as OpenAI’s GPT-3.5 to respond to unscripted questions from customers.
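The chatbot scenario is a simple two-stage pipeline: a language model drafts a reply to an unscripted question, and the reply text is handed to text-to-speech for the avatar to speak. The sketch below shows that flow with placeholder callables standing in for the real OpenAI and Azure Speech clients, both of which require credentials and network access.

```python
from typing import Callable

def answer_with_avatar(
    question: str,
    llm: Callable[[str], str],     # e.g. a GPT-3.5 chat-completion call
    tts: Callable[[str], bytes],   # e.g. an avatar/TTS synthesis call
) -> bytes:
    """Generate a text reply with an LLM, then synthesize it as speech."""
    reply = llm(question)
    return tts(reply)

# Stub clients so the flow can be exercised locally (placeholders only):
fake_llm = lambda q: f"Thanks for asking about {q}!"
fake_tts = lambda text: text.encode("utf-8")  # stands in for synthesized audio

audio = answer_with_avatar("pricing", fake_llm, fake_tts)
```

Swapping the stubs for real SDK clients changes nothing about the pipeline's shape, which is what makes the "unscripted question" use case straightforward to wire up.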
There are countless ways a tool like this could be misused, and Microsoft, to its credit, recognizes that. (Similar avatar-generating technology from AI startup Synthesia has been abused to produce propaganda in Venezuela and false news reports promoted by pro-China social media accounts.) At launch, most Azure subscribers will only have access to prebuilt avatars rather than custom ones; Microsoft says custom avatars are currently a “limited access” feature, available by registration only and only for “certain use cases.”
However, this feature raises a number of unpleasant ethical questions.
One of the key issues in the recent SAG-AFTRA strike was the use of AI to create digital likenesses. The studios ultimately agreed to compensate actors for the AI-generated likenesses they create. But what about Microsoft and its customers?
I asked Microsoft whether it has a stance on companies using an actor’s likeness without that actor’s compensation, or even notification. The company didn’t respond, nor did it say whether it will require companies to label their avatars as AI-generated, as YouTube and a growing number of other platforms do.
Personal Voice
Microsoft appears to have more guardrails in place for Personal Voice, a related generative AI tool that was also announced at Ignite.
Personal Voice, a new capability within Microsoft’s custom neural voice service, can replicate a user’s voice in a few seconds from a one-minute speech sample provided as an audio prompt. Microsoft is pitching it as a way to create personalized voice assistants, dub content into different languages, and generate bespoke narration for stories, audiobooks, and podcasts.
To ward off potential legal trouble, Microsoft is requiring that users give “explicit consent” in the form of a recorded statement before a customer can use Personal Voice to synthesize their voice. For now, access to the feature is gated behind a registration form, and customers must agree to use Personal Voice only in applications that do not “read out loud user-generated or open-ended content.”
“Voice model usage must remain within an application, and output must not be publicly available or shareable from the application,” Microsoft writes in a blog post. “[C]ustomers who meet limited-access eligibility criteria maintain sole control over the creation of, access to, and use of the voice models and their output [where it concerns] dubbing for film, TV, video, and audio for entertainment scenarios only.”
Microsoft also didn’t answer TechCrunch’s questions about how voice actors might be compensated for their Personal Voice contributions, or whether it plans to implement any kind of watermarking technology so that AI-generated voices can be more easily identified.
Read more about Microsoft Ignite 2023 on TechCrunch.
This article was originally published on November 15 at 8 a.m. PT and updated at 3:30 p.m. PT.