
Google’s new AI turns text into music



Google researchers have created an AI that can generate minutes-long pieces of music from text prompts, and can even transform a whistled or hummed melody into other instruments, similar to how systems like DALL-E generate images from written prompts (via TechCrunch). The model is called MusicLM, and while you can’t play with it yourself, the company has uploaded a set of samples that it produced using the model.

The examples are impressive. There are 30-second snippets of what sound like real songs, created from paragraph-long descriptions that prescribe a genre, mood, and even specific instruments, as well as five-minute pieces generated from a word or two like “melodic techno.” Perhaps my favorite is a demo of the ‘story mode,’ where the model is basically given a script to change between prompts. For example, this prompt:

electronic track played in a video game (0:00-0:15)

meditation song played next to a river (0:15-0:30)

fire (0:30-0:45)

fireworks (0:45-0:60)

Resulted in the audio you can listen to here.

It may not be for everyone, but I could easily imagine this being composed by a human (I’ve also listened to it dozens of times while writing this article). Also on the demo site are examples of what the model produces when asked to generate 10-second clips of instruments such as the cello or maracas (the latter example is one where the system does relatively poorly), 8-second clips of a certain genre, music that would suit a prison break, and even what a novice pianist would sound like versus an advanced one. It also includes interpretations of phrases such as ‘futuristic club’ and ‘accordion death metal’.

MusicLM can even simulate human vocals, and while it seems to get the tone and overall sound of voices right, there’s a quality to them that’s definitely off. The best way I can describe it is that they sound grainy or staticky. That quality is not as clear in the example above, but I think this one illustrates it quite well.

Which, by the way, is the result of asking it to make music that would play in a gym. You might also have noticed that the lyrics are nonsense, but in a way that you might not necessarily catch if you’re not paying attention, something like listening to someone sing in Simlish or that one song that’s meant to sound like English but isn’t.

I won’t pretend to know how Google got these results, but it has released a research paper explaining it in detail, if you are the type of person who would understand this figure:

Image showing part of MusicLM's process involving SoundStream, w2v-BERT and MuLan.
A figure explaining the “hierarchical sequence-to-sequence modeling task” that the researchers use along with AudioLM, another Google project.
Graphics: Google

AI-generated music has a long history stretching back decades; there are systems that have been credited with composing pop songs, copying Bach better than a human could in the ’90s, and accompanying live performances. One recent version uses the AI image generation engine Stable Diffusion to turn text prompts into spectrograms, which are then turned into music. The paper says MusicLM can outperform other systems in terms of audio quality and adherence to the caption, as well as in the fact that it can take in audio and copy the melody.

That last part is arguably one of the coolest demos the researchers have released. The site lets you play the input audio, where someone hums or whistles a tune, then shows you how the model reproduces it as an electronic synth lead, string quartet, guitar solo, etc. From the samples I listened to, it manages the task very well.

As with other forays into this type of AI, Google is being significantly more cautious with MusicLM than some of its peers have been with similar technology. “We have no plans to release models at this time,” the paper concludes, citing the risks of “potential misappropriation of creative content” (read: plagiarism) and possible cultural appropriation or misrepresentation.

It’s always possible that the technology will show up at some point in one of Google’s fun musical experiments, but for now the only people who can take advantage of the research are other people building musical AI systems. Google says it is publicly releasing a dataset containing about 5,500 music-text pairs, which could help train and evaluate other musical AIs.

Shreya Christina
Shreya has been with for 3 years, writing copy for client websites, blog posts, EDMs and other mediums to engage readers and encourage action. By collaborating with clients, our SEO manager and the wider team, Shreya seeks to understand an audience before creating memorable, persuasive copy.

