# NVIDIA’s New AI: Stunning Voice Generator!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=fj-Ipgw9kl8
- **Date:** 26.11.2024
- **Duration:** 6:21
- **Views:** 149,915

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papersllm

📝 The blog post and paper are available here:
https://blogs.nvidia.com/blog/fugatto-gen-ai-sound-model/
https://d1qx31qr3h6wln.cloudfront.net/publications/FUGATTO.pdf

Voice isolation (with timestamp): https://youtu.be/qj1Sp8He6e4?si=ZtSesU1e7jeoN55U&t=63

📝 My paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or this is the orig. Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

#nvidia #fugatto

## Contents

### [0:00](https://www.youtube.com/watch?v=fj-Ipgw9kl8) Segment 1 (00:00 - 05:00)

This is Fugatto, NVIDIA's new AI, and they say it is the Swiss Army knife of sound. But why does this even make sense? There are previous AI systems that can generate sound effects, even songs, so what is new here? Well, check this out. Oh goodness! So previous systems can do a lot, but none of them can really do everything. But this one can, so they say. Well, let's have a look together through five amazing examples and see for ourselves.

First, text to sound. Just write what you wish to hear, and there you go. Hopefully it does not work for a Two Minute Papers script to put me out of business. We'll see about that.

Two, text to sound, but this time crazier: a train passing by and the sound morphing into a string orchestra. This sounds impossible. Let's listen. Wow! Now let's do something crazier. Wait, what? Synthesizing someone talking: "Kids are talking by the door." Okay, I am going out of business. But wait, maybe not yet, because some of you say that the best part is that I put so much emotion and enthusiasm into these videos. That's always going to work. Well, check this out: "Kids are talking by the door." Oh goodness, that is really convincing. Not perfect yet, but if you apply the First Law of Papers, yes, a couple more papers down the line and it will surely be excellent.

Now three, on to musicians. Imagine that you have a little something down, but not a full song. Now let's add drums to it. But it gets crazier, way crazier: "Dear Fellow Scholars, this is Two Minute Papers with Dr. Kár..." Forget this.

Four, this is going to sound like science fiction. Just play something through the piano and transform it into a singing female voice. With a tool like this, now we can all become musicians, and we don't need expensive hardware and instruments to make it happen. That is so cool.

Or five, this is one of my favorites. Remember, they promise emergent properties. That means that it can combine two things without having heard that combination. That would be the definition of some sort of intelligence for me, so I am super excited about this. For instance, it can put together a mix of instruments arranged in unexpected ways. This comes out nearly instantly. So now hold on to your papers, Fellow Scholars, and just let your imagination run wild with a howling saxophone. Or, of course, let the memes flow with this one: a dog barking over electronic music.

"Fugatto is a groundbreaking foundation model that gives you sonic superpowers." It can also perform voice isolation, so when you drop in a full song, it can isolate and get out just the singing, or for karaoke purposes, just the instruments. I am unable to show you this here due to YouTube reasons, but of course I put a link to it in the video description.

And we are just getting started, as the research paper reveals some unexpected insights. Look, this is incredible. The new system, Fugatto, is as good as other generalist systems that can do multiple things. But what about the specialist AIs that can do only one thing, but they do that very well? Those are the best, right? Well, hold on to your papers, Fellow Scholars, because occasionally this one can beat those too, which is absolutely stunning. A generalist system beating specialists, that is crazy. These systems are super optimized for one thing, and the generalist can still win. Imagine an Olympic swimmer suddenly walking over and winning a medal in wrestling. That would be probably impossible, and that sort of thing happening here in AI is absolutely insane. What a time to be alive! And here comes perhaps the best

### [5:00](https://www.youtube.com/watch?v=fj-Ipgw9kl8&t=300s) Segment 2 (05:00 - 06:00)

part. Oh my goodness, look at that. These don't require supercomputers. These are tiny models. Two of the three AI systems may be small enough to run even on your phone. I can't believe that. So cool! And just imagine combining it with ChatGPT to write the lyrics for you, add the lyrics here, and you have your own song. Or just imagine combining it with the NotebookLM AI that can create text or even a podcast about a technical research paper. And before you ask, yes, it is me in human form speaking into the microphone, trying to hold on to my papers and flipping out every episode. So what do you think? What would you Fellow Scholars use this for? Let me know in the comments below.

We need new tools for the era of LLMs, and Weights & Biases now has Weave, a lightweight toolkit to confidently iterate on LLM applications. Use traces to debug how data flows through each step of your app, and use evaluations to measure your progress. It is the best. Try it out now at wandb.me/papersllm or click the link in the description below.

---
*Source: https://ekstraktznaniy.ru/video/17230*