Microsoft Ignite 2023 Unveils AI Tool for Crafting Photorealistic Speaking Avatars

  • Isla MacDonald
  • Nov 16, 2023

At its Ignite 2023 event, Microsoft unveiled a tool for creating and animating photorealistic avatars that can say things the actual person never said. Called Azure AI Speech text-to-speech avatar, the tool is now in public preview and lets users produce videos of an avatar that emulates a particular person's appearance and voice. Users upload images of the person and write a script of what the avatar should say. The tool then animates the avatar from this input and pairs it with a spoken rendition of the script, using either a prebuilt voice model or one trained on the person's own voice.

Microsoft envisions a range of applications for the tool, from training materials and product walkthroughs to customer testimonials and other video content, all generated from plain text input. The avatars can also be built into chatbots and virtual assistants, where they can converse in multiple languages and answer unscripted questions by drawing on AI models such as OpenAI's GPT-3.5.

Microsoft acknowledges the technology's potential for misuse: similar avatar-generating tools have been implicated in the spread of false information and propaganda. As a precaution, most Azure customers can currently use only prebuilt models. Custom avatars are gated and available on a limited basis, for specific applications, to customers who register.

The introduction of such a feature inevitably leads to thorny ethical debates, one issue being the proper compensation for the use of actors' digital likenesses, a hot topic during recent industry strikes. Microsoft has yet to publicly respond to inquiries regarding its stance on this matter and whether it will impose requirements for its customers to clearly identify avatars as AI-generated, as is becoming more common on platforms such as YouTube.

Nevertheless, Microsoft requires strict compliance for custom avatars: customers must obtain explicit written consent from the people depicted and must define the scope and limits of the avatars' use. Customers must also disclose that the avatars are AI-generated, so that viewers know they are synthetic.

Alongside the avatar tool, Microsoft is launching another generative AI tool at Ignite, called personal voice. It aims to clone a user's voice from just one minute of recorded speech, with potential uses in voice assistants, content translation, and personalized audio content. To guard against abuse and legal complications, Microsoft has put strict measures in place. Users must give clear consent, and synthesized speech must remain within the application: distributing or sharing the generated audio is prohibited.

Questions were raised about how voice actors would be compensated and whether watermarks would identify voices as AI-generated. Microsoft later clarified that watermarks will be applied automatically to personalized voices to help identify synthetically generated speech. However, the implementation of watermarks is contingent on Microsoft's approval, which adds another layer of complexity to the evolving relationship between technology and content creation.
