Jul 17, 05:00 PM

Nvidia Releases BigVGAN v2 Neural Vocoder, Transforming Audio Synthesis

Nvidia has released BigVGAN v2, a state-of-the-art neural vocoder that generates audio waveforms from mel spectrograms. The new model ships with a custom CUDA kernel for inference that fuses the upsampling and activation operations, giving up to 3x faster inference on A100 GPUs. BigVGAN v2 also uses an improved discriminator and loss: a multi-scale sub-band CQT discriminator and a multi-scale mel spectrogram loss. The release is expected to advance the field of audio synthesis.
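
For readers unfamiliar with the term, the idea behind a multi-scale mel spectrogram loss can be sketched in a few lines of plain Python. This is an illustration only, not Nvidia's implementation: the function names, the three (FFT size, hop, mel bands) scales, the 8 kHz sample rate, and the use of a mean absolute difference of log-mel values are all assumptions chosen to keep the example small.

```python
import cmath
import math


def hz_to_mel(f):
    # Common HTK-style mel formula.
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def magnitude_frame(frame):
    """Hann-windowed magnitude spectrum via a naive DFT (slow but dependency-free)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def log_mel_spectrogram(signal, n_fft, hop, n_mels, sample_rate):
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        mag = magnitude_frame(signal[start:start + n_fft])
        frames.append([
            math.log(sum(w * a for w, a in zip(row, mag)) + 1e-5)
            for row in bank
        ])
    return frames


def multi_scale_mel_loss(reference, generated, sample_rate, scales):
    """Sum, over all scales, of the mean absolute log-mel difference."""
    total = 0.0
    for n_fft, hop, n_mels in scales:
        ref = log_mel_spectrogram(reference, n_fft, hop, n_mels, sample_rate)
        gen = log_mel_spectrogram(generated, n_fft, hop, n_mels, sample_rate)
        diffs = [abs(r - g) for fr, fg in zip(ref, gen) for r, g in zip(fr, fg)]
        total += sum(diffs) / len(diffs)
    return total


# Hypothetical settings: (n_fft, hop, n_mels) per scale.
SAMPLE_RATE = 8000
SCALES = [(32, 8, 4), (64, 16, 8), (128, 32, 16)]

reference = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(512)]
detuned = [math.sin(2 * math.pi * 660 * t / SAMPLE_RATE) for t in range(512)]

loss_same = multi_scale_mel_loss(reference, reference, SAMPLE_RATE, SCALES)
loss_diff = multi_scale_mel_loss(reference, detuned, SAMPLE_RATE, SCALES)
print(loss_same, loss_diff)
```

Comparing the two signals at several window sizes means both coarse and fine spectral differences contribute to the loss. A real training setup would compute this on GPU tensors with an FFT rather than the naive DFT used here.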

#Nvidia #BigVGAN #Mel

Written with ChatGPT (GPT-4o).