Google’s new AI turns text into music


Google researchers have created an AI that can generate minutes-long pieces of music from text prompts, and can even transform a whistled or hummed melody into other instruments, similar to how systems like DALL-E generate images from written prompts (via TechCrunch). The model is called MusicLM, and while you can’t play with it yourself, the company has uploaded a set of samples that it produced using the model.

The examples are impressive. There are 30-second snippets of what sound like real songs created from paragraph-long descriptions that prescribe a genre, mood, and even specific instruments, as well as five-minute pieces generated from a word or two like “melodic techno.” Perhaps my favorite is a demo of the “story mode,” where the model is basically given a script to shift between prompts. For example, this prompt:

electronic track played in a video game (0:00-0:15)

meditation song played next to a river (0:15-0:30)

fire (0:30-0:45)

fireworks (0:45-0:60)

Resulted in the audio you can listen to here.

It may not be for everyone, but I could totally see this having been composed by a human being (I’ve also listened to it dozens of times while writing this article). Also on the demo site are examples of what the model produces when asked to generate 10-second clips of instruments such as the cello or maracas (the latter example is one where the system does relatively poorly), 8-second clips of a certain genre, music that would suit a prison break, and even what a novice pianist would sound like versus an advanced one. It also includes interpretations of phrases such as “futuristic club” and “accordion death metal.”

MusicLM can even simulate human vocals, and while it seems to get the tone and overall sound of voices right, there’s a quality to them that’s definitely off. The best way I can describe it is that they sound grainy or staticky. That quality is not so clear in the example above, but I think this one illustrates it quite well.

That, by the way, is the result of asking it to make music that would play in a gym. You might also have noticed that the lyrics are nonsense, but in a way that you might not catch if you’re not paying attention, something like listening to someone sing in Simlish or that one song that is meant to sound like English but isn’t.

I won’t pretend to know how Google got these results, but it has released a research paper explaining it in detail, if you are the type of person who would understand this figure:

Image showing part of MusicLM's process involving SoundStream, w2v-BERT and MuLan.
A figure explaining the “hierarchical sequence-to-sequence modeling task” that the researchers use along with AudioLM, another Google project.
Graphics: Google

AI-generated music has a long history stretching back decades; there are systems that have been credited with composing pop songs, copying Bach better than a human could in the ’90s, and accompanying live performances. One recent version uses the AI image generation engine Stable Diffusion to turn text prompts into spectrograms, which are then turned into music. The paper says MusicLM can outperform other systems in terms of audio quality and adherence to the caption, and that it can also take in audio and copy the melody.

That last part is arguably one of the coolest demos the researchers have released. The site lets you play the input audio, where someone hums or whistles a tune, then shows you how the model reproduces it as an electronic synth lead, string quartet, guitar solo, etc. From the samples I listened to, it manages the task very well.

As with other forays into this type of AI, Google is being significantly more cautious with MusicLM than some of its peers have been with similar technology. “We have no plans to release models at this time,” the paper concludes, citing the risks of “potential misappropriation of creative content” (read: plagiarism) and possible cultural appropriation or misrepresentation.

It’s always possible that the technology will show up at some point in one of Google’s fun musical experiments, but for now the only people who can take advantage of the research are other people building musical AI systems. Google says it is publicly releasing a dataset containing about 5,500 music-text pairs, which could help train and evaluate other musical AIs.

Shreya Christina
