Meta Platforms Joins the AI Software Adoption Extravaganza

Jun 19, 2023

Meta Platforms (META), formerly Facebook (FB), has introduced Voicebox, a generative AI model for text-to-speech that helps with audio editing, sampling and stylizing. Voicebox is the latest addition to a growing series of generative AI models, alongside ChatGPT from Microsoft-backed (MSFT) OpenAI, which is known for text generation, and Dall-E, which generates images.

Facebook's parent company says Voicebox can create high-quality audio clips and edit pre-recorded audio, for example by removing background noise such as car horns and barking dogs, while preserving the content and style of the audio. The model is also multilingual and can generate speech in six languages, with more to come.

Using audio samples as short as two seconds, Voicebox can match the style of the sample and use it for text-to-speech generation. Voicebox can also restore parts of a speech interrupted by noise, or replace poorly audible words, without re-recording the entire speech.

Additionally, if Voicebox receives a sample of a person's speech and a piece of text in English, French, German, Spanish, Polish, or Portuguese, it can read the text in any of those languages, even if the sample speech and text are in different languages.

The AI model can also generate speech that is more representative of how people speak in the real world and in the six languages mentioned.

Meta shares closed Friday’s session with a 5.2% weekly gain.