GAN-based Vocoders demo



Reference papers

Parallel WaveGAN

Oord, A. V. D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., ... & Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

Yamamoto, R., Song, E., & Kim, J. M. (2020, May). Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6199-6203). IEEE.

MelGAN embedding

Kumar, K., Kumar, R., de Boissiere, T., Gestin, L., Teoh, W. Z., Sotelo, J., ... & Courville, A. C. (2019). MelGAN: Generative adversarial networks for conditional waveform synthesis. In Advances in Neural Information Processing Systems (pp. 14910-14921).



GitHub code:
https://github.com/CODEJIN/GAN_based_Vocoders



Columns (model configuration):
  Original
  G: PWGAN, D: PWGAN, SR: 16K
  G: PWGAN, D: PWGAN, SR: 24K
  G: PWGAN, D: MelGAN, SR: 16K

Rows (source audio):
  24K source, trained
  22.05K source, trained
  16K source, unseen
  24K source, unseen
  Non-English