How do vocal synths work?

Kiyoteru

UtaForum power user
Supporter
Defender of Defoko
If you're asking about UTAU specifically, it's sample manipulation. Prerecorded segments of audio are rearranged and pitch-shifted. Nothing is really being generated from scratch.
I am aware that some resamplers take more of a synthesis approach, where the original voice recording is deconstructed into component parts (e.g. the pitch, the vocal tone, noisy parts like consonants and breathiness), then only some of those are modified (e.g. the pitch), and then the vocal is reconstructed into output audio. That still relies on having the original recorded samples to begin with. But in the case of something like Moresampler, once the audio has been analyzed and an LLSM file has been created, it can use just that LLSM file to create audio output without needing the original audio files anymore.
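To make the "pitch shifted" part concrete, here is a minimal sketch of the crudest possible approach: reading through a recorded sample faster or slower. This is not how UTAU resamplers or Moresampler actually work (they keep the note length and vocal tone intact); the function names and the sine-wave stand-in for a voice sample are assumptions made up for illustration.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed for this sketch)

def make_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Stand-in for a prerecorded sample: a plain sine wave."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def semitones_to_ratio(semitones):
    """Convert a shift in semitones to a frequency ratio (12 semitones = 2x)."""
    return 2 ** (semitones / 12)

def resample(samples, ratio):
    """Read through `samples` `ratio` times faster, using linear interpolation."""
    out_len = int((len(samples) - 1) / ratio)
    out = []
    for i in range(out_len):
        pos = i * ratio            # fractional read position in the source
        left = int(pos)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[left + 1] * frac)
    return out

def estimate_pitch(samples, sample_rate=SAMPLE_RATE):
    """Rough pitch estimate: count upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)

source = make_tone(220.0, 1.0)                        # one second at 220 Hz
shifted = resample(source, semitones_to_ratio(12))    # up one octave

print(round(estimate_pitch(source)), len(source))
print(round(estimate_pitch(shifted)), len(shifted))
```

The shifted copy comes out about an octave higher but also about half as long, and on a real voice the vocal tone would be dragged up along with the pitch. Avoiding those side effects is the reason for the analysis and reconstruction approach described above.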
 

Kiyoteru

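That's formant shifting. Formants are the resonance peaks of the vocal tract that give a voice its tone and vowel quality. They often get mixed up with harmonics or overtones, which sit at multiples of the sung pitch. Vocal processing software, such as Auto-Tune, can shift pitch and formants independently of each other.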
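A rough way to picture pitch and formants moving independently is a toy model where harmonics sit at multiples of the pitch and a separate envelope of resonance peaks decides how loud each harmonic is. The sketch below is only an illustration: the formant positions and widths are made-up, roughly "ah"-like values, and none of the names come from any real vocal synth or plugin.

```python
# Assumed, illustrative formant peaks: (centre in Hz, width in Hz).
FORMANTS = [(800.0, 80.0), (1200.0, 90.0), (2500.0, 120.0)]

def envelope(freq_hz, formant_scale=1.0):
    """Loudness of the vocal tone at freq_hz: a sum of resonance peaks.

    formant_scale moves every peak centre up or down (formant shifting).
    """
    total = 0.0
    for centre, width in FORMANTS:
        shifted_centre = centre * formant_scale
        total += 1.0 / (1.0 + ((freq_hz - shifted_centre) / width) ** 2)
    return total

def harmonics(f0_hz, formant_scale=1.0, max_hz=4000.0):
    """List (frequency, amplitude) for each harmonic of the pitch f0_hz."""
    count = int(max_hz // f0_hz)
    return [(k * f0_hz, envelope(k * f0_hz, formant_scale))
            for k in range(1, count + 1)]

def loudest(harmonic_list):
    """Frequency of the strongest harmonic."""
    return max(harmonic_list, key=lambda h: h[1])[0]

# Pitch change only: the harmonics move, the resonance peaks stay put.
print(loudest(harmonics(200.0)), loudest(harmonics(400.0)))

# Formant shift only: same pitch, but the peaks (and the vocal tone) move up.
print(loudest(harmonics(200.0, formant_scale=1.25)))
```

In this toy model, doubling the pitch leaves the strongest harmonic at the same frequency because the first resonance peak has not moved, while scaling the formants at a fixed pitch moves it upward. That is the sense in which the two can be changed separately.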
 
