What makes an UTAU "easy to use" and "natural sounding"?

Nohkara

Pronouns: He/him
Supporter
Defender of Defoko
Many of us know that some UTAUs are easier to use than others, even when they share exactly the same script (recording list), recording tempo, and (nearly) the same recording equipment.

For example, in my opinion Yamine Renri sounds very good by default: even in a minimally tuned or untuned UST, she sounds fine. (I'm not saying she doesn't need any tuning, but some UTAUs, like Meiji, Ruko, or Ritsu Kire, actually NEED more tuning and tweaking to sound as good.)

What factors make an UTAU "easy to use" and "natural sounding"?

Please try to answer with something other than "it's because of VCV/multipitch"; there must be other factors as well!
 

Sors

Local Guppie & UTAU Korean Advocate
Tutor
Defender of Defoko
I think the easiest to use is really CV and monopitch.

Another factor is recording quality. Take Defoko vs. Ruko, for example: clarity plays a big role.

Also, multi-appends that require the suffixbroker are harder to work with.

Then again, it mostly depends on the format. CV is easier to use than VCV and a lot easier than CVVC. For English, it also depends on whether you use Arpasing or VCCV, since Arpasing requires a lot of tuning. It mostly comes down to the reclist used and the overall quality.

The one thing that really makes a difference is English aliasing for VCCV. For example, some banks have only CC samples, others even have CCC or VCC, while others have only VC samples. The sentence 'why can't it be perfect' could be '_wI k@ @ n @nt ti it- bE p3 fi ikt-' or 'wI k@ @ n nt ti it- bE p3 fi i k kt-', depending on the bank.
 

Kiyoteru

UtaForum power user
Supporter
Defender of Defoko
To generalize, a particular reclist will result in a voicebank that's easier to use if it has more phonemes per note. Diphone banks are horrible and finicky, because you must have lots of very tiny notes in a UST just to reconstruct things like consonant clusters. On the other hand, if many phonemes are already in one note, no additional messing around is necessary. This is why I think VCV English banks like Adrian are so fun and simple to use. Many complex sequences of phonemes are already put together for me.
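
To put rough numbers on the "phonemes per note" idea, here is a toy sketch in Python. It is only an illustration under assumptions: the function names are hypothetical, the phoneme symbols for "strengths" are made up for the example, and the fixed-size grouping is a stand-in for "bigger samples", not how UTAU or any real reclist actually segments a word.

```python
def diphone_units(phonemes):
    """One unit per adjacent pair, with '-' marking silence at both edges."""
    padded = ["-"] + list(phonemes) + ["-"]
    return [f"{a} {b}" for a, b in zip(padded, padded[1:])]


def chunked_units(phonemes, size):
    """Group up to `size` phonemes into one unit (a stand-in for larger samples)."""
    phonemes = list(phonemes)
    return [" ".join(phonemes[i:i + size]) for i in range(0, len(phonemes), size)]


# Illustrative phoneme symbols for "strengths" (an assumption, not a real alias set).
word = ["s", "t", "r", "E", "N", "T", "s"]

print(len(diphone_units(word)))     # 8 notes when every transition is its own unit
print(len(chunked_units(word, 3)))  # 3 notes when up to three phonemes share a unit
```

In this toy model the same word needs 8 notes as diphones but only 3 when each note can hold up to three phonemes, which is the kind of difference that makes a bank feel finicky or simple in a UST.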
 
