AutoVC demo



Reference paper:
Qian, K., Zhang, Y., Chang, S., Yang, X., & Hasegawa-Johnson, M. (2019).
AutoVC: Zero-shot voice style transfer with only autoencoder loss.
arXiv preprint arXiv:1905.05879.


GitHub code:
https://github.com/CODEJIN/AutoVC



Official code replication

Source              Style               Conversion
VCTK P270 (Male)    VCTK P256 (Male)
                    VCTK P228 (Female)
VCTK P225 (Female)  VCTK P256 (Male)
                    VCTK P228 (Female)



Results



Used datasets

Trained  Language  Sex     Dataset / Speaker
Yes      English   Male    VCTK / P226
No       English   Male    NUS-48E / ZHIY
No       Korean    Male    Not from a dataset
Yes      English   Female  VCTK / P276
No       English   Female  LJSpeech
No       Korean    Female  KSS


Trained English Male (Speaker: VCTK P226)

Source              Style                      Conversion
Trained Eng Male    Trained Eng Male (self)
                    Unseen Eng Male
                    Unseen Kor Male
                    Trained Eng Female
                    Unseen Eng Female
                    Unseen Kor Female

Unseen English Male (Speaker: NUS-48E ZHIY)

Source              Style                      Conversion
Unseen Eng Male     Trained Eng Male
                    Unseen Eng Male (self)
                    Unseen Kor Male
                    Trained Eng Female
                    Unseen Eng Female
                    Unseen Kor Female

Unseen Korean Male (Speaker: Korean talker)

Source              Style                      Conversion
Unseen Kor Male     Trained Eng Male
                    Unseen Eng Male
                    Unseen Kor Male (self)
                    Trained Eng Female
                    Unseen Eng Female
                    Unseen Kor Female

Trained English Female (Speaker: VCTK P276)

Source              Style                      Conversion
Trained Eng Female  Trained Eng Male
                    Unseen Eng Male
                    Unseen Kor Male
                    Trained Eng Female (self)
                    Unseen Eng Female
                    Unseen Kor Female

Unseen English Female (Speaker: LJSpeech)

Source              Style                      Conversion
Unseen Eng Female   Trained Eng Male
                    Unseen Eng Male
                    Unseen Kor Male
                    Trained Eng Female
                    Unseen Eng Female (self)
                    Unseen Kor Female

Unseen Korean Female (Speaker: KSS)

Source              Style                      Conversion
Unseen Kor Female   Trained Eng Male
                    Unseen Eng Male
                    Unseen Kor Male
                    Trained Eng Female
                    Unseen Eng Female
                    Unseen Kor Female (self)