
Meta's Tools Open Up New Possibilities for Music Creation

Music is one of the most expressive and creative forms of art, but it also requires a lot of skill, knowledge and equipment to produce. What if you could create music with just a few clicks, using the power of artificial intelligence?

That's the idea behind AudioCraft, a new open-source project from Meta (formerly Facebook) that aims to democratize music creation and enable anyone to generate sounds and music with AI models. AudioCraft is built around two generative models: AudioGen and MusicGen.

- Based on articles by Kyle Wiggers in TechCrunch (August 2, 2023) and by Garling Wu (July 16, 2023).

  • AudioGen is a model that can generate realistic sounds from any category, such as animals, instruments, or vehicles. You can use it to create sound effects, ambient sounds, or even mix and match different sounds to create new ones.

  • MusicGen is a model that can generate melodies and harmonies in any genre, such as classical, jazz, rock, or pop. You can use it to create original tunes, remix existing songs, or even collaborate with other musicians online.

According to Meta, MusicGen can produce high-quality music samples, though we discovered that the researchers involved defined high-quality as a 32 kHz sample rate. This sits between the requirements of speech synthesis (16 kHz) and the CD standard for digital music (44.1 kHz). In practice, the audio doesn't meet the minimum quality standards you might be used to hearing on the radio or on streaming platforms. However, compared to other AI music generators, and considering where the technology stands at the time of writing, the audio quality is fairly good, with a low noise level in the file.
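To put those sample rates in perspective: by the Nyquist theorem, a signal sampled at rate f can only represent frequencies up to f/2. A quick back-of-the-envelope sketch (plain Python, no audio libraries needed):

```python
# Nyquist limit: a sample rate of f Hz can capture frequencies up to f / 2.
# Compare the three rates mentioned above.
rates_hz = {
    "speech synthesis": 16_000,
    "MusicGen output": 32_000,
    "CD-quality audio": 44_100,
}

for name, rate in rates_hz.items():
    nyquist = rate / 2
    print(f"{name}: {rate} Hz sampling -> max representable frequency {nyquist:.0f} Hz")

# MusicGen's 32 kHz output tops out at 16 kHz, below the ~20 kHz upper
# limit of human hearing that the 44.1 kHz CD standard was chosen to cover.
```

That 16 kHz ceiling is why the output can sound slightly dull next to streaming-quality audio even when the generation itself is clean.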

The project is not only a tool for music enthusiasts, but also a platform for music researchers and developers. You can access the models and the code on GitHub, experiment with different parameters, datasets, and techniques, and contribute to the project by sharing your feedback, ideas, and creations.


Meta announced AudioCraft, a framework to generate what it describes as “high-quality,” “realistic” audio and music from short text descriptions, or prompts. It’s not Meta’s first foray into audio generation — the tech giant open-sourced an AI-powered music generator, MusicGen, in June — but Meta claims that it has made advances that vastly improve the quality of AI-generated sounds, such as dogs barking, cars honking and footsteps on a wooden floor.

AudioCraft contains three generative AI models: MusicGen, AudioGen and EnCodec.

  • AudioGen, the other audio-generating model contained in AudioCraft, focuses on generating environmental sounds and sound effects as opposed to music and melodies.

  • AudioGen is a diffusion-based model, like most modern image generators (see OpenAI’s DALL-E 2, Google’s Imagen and Stable Diffusion). In diffusion, a model learns how to gradually subtract noise from starting data made entirely of noise — for example, audio or images — moving it closer step by step to the target prompt.

  • The last of AudioCraft’s three models, EnCodec, is an improvement over a previous Meta model for generating music with fewer artifacts. Meta claims that it more efficiently models audio sequences, capturing different levels of information in training data audio waveforms to help craft novel audio.
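The diffusion idea described above can be illustrated with a toy loop in plain NumPy. This is a deliberately simplified sketch, not Meta's training or sampling code: a fixed sine wave stands in for "the audio matching the prompt," and instead of a trained network predicting the noise, we compute it directly. The point is only the mechanics — start from pure noise and remove a fraction of the estimated noise at each step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "target": the clean signal a real diffusion model would be
# guided toward by the text prompt (here just a fixed sine wave).
t = np.linspace(0.0, 1.0, 256)
target = np.sin(2 * np.pi * 5 * t)

# Start from data made entirely of noise, as in diffusion sampling.
x = rng.normal(size=target.shape)

# Toy denoising loop: each step subtracts a fraction of the estimated
# noise, moving the sample gradually closer to the target.
for step in range(50):
    estimated_noise = x - target   # a trained model would *predict* this
    x = x - 0.1 * estimated_noise  # remove 10% of the noise per step

error = np.abs(x - target).max()
print(f"max deviation from target after 50 steps: {error:.4f}")
```

Each iteration shrinks the remaining noise by a constant factor, so after 50 steps the sample is very close to the target — the same gradual noise-to-signal trajectory the paragraph above describes.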
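The "different levels of information" claim about EnCodec can be illustrated with a coarse-to-fine residual scheme: each stage encodes only what the previous stages left behind. This NumPy sketch is an illustration of that residual idea, not Meta's actual codec, and uses simple grid rounding in place of learned vector quantizers.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(size=8)  # stand-in for one frame of an audio waveform

def quantize(x, step):
    """Snap values to a grid of width `step` (a crude scalar quantizer)."""
    return np.round(x / step) * step

# Coarse-to-fine residual quantization: early stages capture broad
# structure, later stages capture progressively finer detail.
residual = signal.copy()
levels = []
for step in (1.0, 0.25, 0.0625):   # progressively finer grids
    q = quantize(residual, step)
    levels.append(q)
    residual = residual - q        # pass the leftover to the next stage

reconstruction = sum(levels)
error = np.abs(signal - reconstruction).max()
print(f"stages: {len(levels)}, max reconstruction error: {error:.4f}")
```

Summing the levels reconstructs the signal to within half of the finest grid step, which is why adding stages (levels of information) directly reduces artifacts in the decoded audio.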


Summary by The New Bing!

