Preprint / Version 1

Voice-Changing Detection with Convolutional Neural Network

##article.authors##

  • Chuntung Zhuang

DOI:

https://doi.org/ 10.47611/harp.139

Keywords:

voice-changing

Abstract

Voice-changing is a voice transformation technique that directly modifies a voice’s pitch, tone, and timbre. Like other types of voice transformation, voice-changing is threatening our society from an economic, social, and legal perspective; hence in this study, we tackle this problem by adopting a convolutional neural network. As far as we know, our proposed work is the first solution to distinguish voice-changed voices and authentic voices. In addition, we also conducted experiments and gained significant insights that can help facilitate further research into voice-changing and audio-related machine learning. Foremost, we find the architecture with four convolutional layers to be the most effective. Second, we find a strong and positive correlation between the training data size and the performance in terms of accuracy and stability. Finally, we discovered that different training languages yield very different  performances, with Chinese far outweighing English as a training language. We were inspired by the findings of different training languages, which motivated us to conduct a supplementary experiment: we concluded that tone is one factor that contributes to making a training language better in spotting voice-changed voices.

Downloads

Posted

2022-03-31