Meta this month unveiled Audiobox, a new AI sound generator. The tool lets users convert text into speech, compose music, and create sound effects from simple text prompts. Its introduction is a step toward making custom audio creation accessible to a wider range of users.
Meta’s Audiobox text-to-sound audio creation system is the successor to Voicebox. It advances generative AI for audio by unifying generation and editing capabilities for speech, sound effects (short, discrete sounds such as a dog bark, a car horn, or a crack of thunder), and soundscapes, with a variety of input mechanisms to maximize controllability for each use case.
Audiobox covers a range of audio generation needs. Its text-to-speech capability turns written text into realistic speech, with a range of voice options to choose from. Those interested in music can use Audiobox to craft tracks without mastering traditional instruments or complex software. The tool can also produce tailored sound effects, which are particularly useful for gaming, film, and other multimedia projects. Users can customize audio output with text or audio prompts.
Text to sound AI audio generation
Meta has designed Audiobox so that people can create sounds by describing, in natural language, the sound or type of speech they want. For example, to create a new sound, enter a text prompt such as “A running river and birds chirping” into the AI model.
Audiobox is more than a basic sound generator; it also includes advanced features. One is voice cloning, which duplicates a voice from a sample and allows for personalized audio creation. The tool can also restyle existing audio to fit different contexts, and it can edit or replace segments of audio with AI-generated content, a process known as audio inpainting.
“Audiobox demonstrates state-of-the-art controllability on speech and sound effects generation. Our own tests show it significantly surpasses prior best models (AudioLDM2, VoiceLDM, and TANGO) on quality and relevance (faithfulness to text description) in subjective evaluations. Audiobox outperforms Voicebox on style similarity by over 30 percent on a variety of speech styles.”
Availability and pricing
Meta is currently making Audiobox available to a hand-selected group of researchers and academic institutions with a track record in speech research, to help further the state of the art in this area. The company says it is committed to ensuring that Audiobox is used ethically and responsibly, and it has implemented safeguards to prevent potential misuse. That commitment is further reflected in a grant application process that supports research into the safe use of Audiobox.
Another feature is Audiobox Maker, which lets users construct complex audio scenes by layering sounds and music to build immersive soundscapes. Meta’s interactive Audiobox demo and research paper are now available, so you can try the new foundation research model for audio generation yourself.
Audiobox is poised to make a significant impact on the audio production industry. Its broad feature set and emphasis on ethical use open up new possibilities for content creators, musicians, and developers. Further updates are still to come, including a potential open-source release and the outcomes of ongoing safety and responsibility research, but the tool already looks set to become a valuable asset in audio production.