Ever wanted to hear a saxophone bark? Nvidia just made the ‘world’s most flexible sound machine’ that uses AI to blend music, voices and sounds

Nvidia has announced a new generative AI audio tool called Fugatto, which it’s describing as the “world’s most flexible sound machine” – capable of producing all kinds of music, speech, and other audio, and even unique sounds that have never been heard before.

Fugatto, which is short for Foundational Generative Audio Transformer Opus 1, can work with text prompts and audio samples. You can simply describe what you want to hear, or get the AI model to modify or combine existing audio clips.

For example, you can have the sound of a train transform into a lush orchestral arrangement, or mix a banjo melody with the sounds of rainfall. You can hear the sound of a saxophone barking, or a flute meowing, just by typing in a prompt.

Fugatto can also isolate vocals from tracks, and change the vocal delivery style, as well as generate speech from scratch. Feed in an existing melody, and you can have it played on whatever instrument you like, in any kind of style.

The bad news – it’s not available yet

So how can you try out this impressive new AI technology? You can’t, for the time being: you’ll have to make do with Nvidia’s promo video and a website of samples. There’s no word yet on when Fugatto will be available for public testing.

Some of the samples published by Nvidia include the sound of a female voice barking, a factory machine screaming, a typewriter whispering, and a cello shouting with anger. Together they show the wide variety of audio effects that are possible.

Nvidia has also demonstrated how the AI engine is able to produce spoken word clips, which can then be delivered with a range of different emotions (from angry to happy) and even with different accents applied.

“We wanted to create a model that understands and generates sound like humans do,” says Nvidia’s Rafael Valle, one of the Fugatto team. “Fugatto is our first step toward a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale.”
