How to create a realistic-sounding UTAU?

FeatheredFinch

Ritsu's Renegades
Defender of Defoko
Obviously the first thing we think of is more pitches, but I've heard banks like "Al!ce" sound incredibly realistic with only two pitches! I think it comes down to the way we record and the microphone quality, but also to recording farther away from the mic, which creates a crisper sound that I find works a lot better in UTAU.

What do you all think?
 

FeatheredFinch

Ritsu's Renegades
Defender of Defoko
Thread starter
Maybe don't "force" your recordings? I recorded in the most normal way I could. Otos are also a big factor here.
True, but I mean the crispness of the voice. A lot of UTAUs sound very nasal (Tonyu's banks do) or too muffled.
 

Sors

Local Guppie & UTAU Korean Advocate
Tutor
Defender of Defoko
The oto, using the range well, crossfading, and VCV or CVVC. Definitely not English, as English banks never really sound realistic due to the consonants. The most realistic results are usually achieved with as few ending consonants as possible... though this applies more to using a voicebank than to creating one...

But then again, the usage is one of the most important factors as well!
 

VocAddict

The Voice Within Us
Defender of Defoko
Nasality is based on the voicer and the tone you want to achieve so there's not much you can do about that.

Recording the samples more naturally (in both tone and pitch) will obviously give the synthesised output a more realistic sound.

Choosing pitches to record is extremely important to consider as the transition from one pitch to another affects the overall tone. As a result, a bi-pitch bank can sound better than a 6-pitch bank.

The resampler and wavtool used, and the tuning done (enveloping, flags, etc.), matter as well.

And of course, the equipment used to record the bank and the environment in which it was recorded. And obviously the oto.
 

FeatheredFinch

Ritsu's Renegades
Defender of Defoko
Thread starter
What pitches would work best? (I'm planning a new voicebank since I'm retiring Tonyu.)
 

Sors

Local Guppie & UTAU Korean Advocate
Tutor
Defender of Defoko
Well, I did it this way:

I recorded 10 samples in a low voice and generated the frq maps in UTAU. The average was G#2, so that pitch became G#2. I did the same for a normal soft voice (F3), a high-normal powerful voice (C#4), and a very high, very powerful voice (F4).
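
To illustrate the "average frequency, then pick the nearest note" step, here is a minimal sketch in Python. It assumes you already have the per-sample fundamental frequencies in Hz (it does not read UTAU's frq files), that A4 = 440 Hz, and that octaves are numbered so that C4 is middle C. The function names `hz_to_note` and `average_note` are made up for this example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_note(freq_hz):
    """Return the nearest equal-tempered note name for a frequency in Hz.

    Assumes A4 = 440 Hz and MIDI-style numbering where C4 is note 60.
    """
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

def average_note(freqs_hz):
    """Average several sample frequencies and name the closest note.

    The average is taken on a log scale (geometric mean), since pitch
    perception is logarithmic in frequency.
    """
    mean_log = sum(math.log2(f) for f in freqs_hz) / len(freqs_hz)
    return hz_to_note(2 ** mean_log)

# Hypothetical averages for three low-voice test samples, in Hz.
print(average_note([100.0, 104.0, 108.0]))
```

If the printed note is where your voice naturally sits for that tone, that is a reasonable pitch to record that set of samples at.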
 

PrinceofHades

A wandering soul
Defender of Defoko
As far as pitches go, the most important thing to remember is that if your first pitch is C4, your other pitch should not be C#4 or B3, since they are right next to it. You would probably be better off choosing D4 for the higher pitch, and either A#3 or A3 for the lower pitch. Basically, do not have all your pitches right next to each other. The reason is that there would be too much overlap between the transitions, which would defeat the purpose of multipitch: to expand range, tonality, or both.
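
The "not right next to each other" rule can be sketched as a quick check in Python. This is only an illustration: the helper names and the 2-semitone minimum are assumptions taken from the C4 / C#4 / B3 example above, not a fixed UTAU rule.

```python
from itertools import combinations

SEMITONE = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
            "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def note_to_midi(name):
    """Convert a name like 'C#4' to a MIDI-style number (C4 = 60).

    Only sharps and single-digit octaves are handled in this sketch.
    """
    pitch, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + SEMITONE[pitch]

def too_close(pitches, min_gap=2):
    """Return every pair of pitches fewer than min_gap semitones apart."""
    return [(a, b) for a, b in combinations(pitches, 2)
            if abs(note_to_midi(a) - note_to_midi(b)) < min_gap]

print(too_close(["C4", "C#4"]))       # the adjacent pair is flagged
print(too_close(["A3", "C4", "D4"]))  # nothing flagged
```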

Hope this helps!
 

Sors

Local Guppie & UTAU Korean Advocate
Tutor
Defender of Defoko
Yep, but don't go too far outside your range... even though that can increase the UTAU's range in a ridiculous way.
E.g. Sora's CV monopitch Range: A2-A4
Sora's CVVC 4-Pitch: F1-D6...
LIKE THAT RANGE IS SOOO RIDICULOUS XD
 

FeatheredFinch

Ritsu's Renegades
Defender of Defoko
Thread starter
Would C3, A3, F3, C4 work?
 

Hazu パワ

Weeaboo yet not Weeaboo
Defender of Defoko
I'm going to try to explain this as best I can.

First off, learn how the human voice works, then learn how your own voice works. You can have the most expensive equipment and the best oto in the world, but if you don't record correctly, the bank won't sound natural.

1: Are you a singer? If not, practice being one. All natural banks (examples: Meiji, Renri and Ritsu) have their voices provided by vocalists. Even with Vocaloids you can tell the difference: Miku is voiced by Fujita, who can sing but is more experienced in voice acting, while Gumi, who sounds tons more natural, is voiced by Nakajima, who is a very experienced vocalist.

2: You don't need to be a top-notch vocalist to provide a natural voice for a bank; you just need to kind of know the ropes. Understand how your singing voice works and how its tone changes as you hit different pitches. (Example: when I sing higher and start using more power, I get kind of raspy [good rasp, not bad rasp], so I captured that in my bank by adding a pitch that does that. Now the bank can sing the way I actually sing.)

3: Sing your samples, don't just say them.
 

PrinceofHades

A wandering soul
Defender of Defoko
Would C3, A3, F3, C4 would that work?

-had to check the piano in his living room-
Yup, that would work. C3 is a little low; you might want to go for C#3, because the gaps between F3, A3, and C4 are only 3 to 4 semitones each, while the gap between C3 and F3 is 5. However, I think it could still work.

Edit:
Actually, check VocAddict's guide. I'm basing this entirely off of music theory and not paying as much attention to UTAU so definitely check the guide VocAddict posted.
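
For anyone who wants to check the gaps without a piano, here is a rough Python sketch that counts the semitones between neighbouring pitches in a proposed set. The function names are made up for this example, and it only measures spacing; it says nothing about how the bank will actually sound.

```python
SEMITONE = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
            "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def note_to_midi(name):
    """Convert a name like 'F3' to a MIDI-style number (C4 = 60)."""
    pitch, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + SEMITONE[pitch]

def semitone_gaps(pitches):
    """Sort the pitches low to high and list the gap between neighbours."""
    ordered = sorted(pitches, key=note_to_midi)
    return [(low, high, note_to_midi(high) - note_to_midi(low))
            for low, high in zip(ordered, ordered[1:])]

for low, high, gap in semitone_gaps(["C3", "A3", "F3", "C4"]):
    print(f"{low} -> {high}: {gap} semitones")
```

Swapping C3 for C#3 in the list shows how the lowest gap shrinks to match the others.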
 

Alessandra

Ruko's Ruffians
Defender of Defoko
Obviously the first thing we think is more pitches but I've heard banks like "Al!ce" and such sound incredibly realistic only having two pitches! I think it's the way we record / the microphone quality but also recording farther away from your mic so it creates a more crisp sound which I find works a lot better in utau.

what do you all think?
Maybe also how you pronounce...
 

HoneyPai

Defoko's Slaves
Defender of Defoko
I'd say a lot of it depends on the way UTAU (the resamplers) and your mic respond to your voice type. Obviously the Snowball responds fantastically to Alice's voice, but it may not be the same with other voice types. Sure, pitches can help, but as you said, Al!ce sounds wonderful with two pitches.
Otos are important, but I think it's more of a voice thing in this situation: you could have perfect otos, perfect pitches for your voice, and even an HQ mic, and your UTAU still might not sound realistic.