"Sung" voicebanks vs "Spoken" voicebanks

Obakebaka

Momo's Minion
I like to tell people I introduce to UTAU that you don't need to be able to sing, as long as you can pronounce the phonemes correctly.
But how much of that is true? How different can the two sound when the same voicer is used but the method changes? Are spoken banks better for talkloids, whereas sung banks are better for singing? Or is the overall quality not affected at all?

I'm looking for opinions from people who have been in the UTAU community longer than I have, or just as long (or even less) but with more experience in general. I'd also like some help comparing the two techniques, so we can give newcomers more accurate information about how to make a GOOD utauloid no matter what it's used for.

Who's with me?
 

FastSpeedy

Teto's Territory
I believe (and I base that on my experience with the program) that when you sing the syllables, the bank becomes more stable. Also, you stretch the syllable (because you are not just saying it), which helps a lot to OTO it better.

When you say it, you may add some inflections to the syllables, which are much less present when you sing them.

In my opinion and experience, of course. I hope it helped ^^
 

KitWistful

Teto's Territory
Defender of Defoko
I've never really identified a "spoken" voicebank! I know if the pitch isn't consistent then UTAU's processing will make it sound robotic...and I don't think it would be very good for talkloids since I'm pretty sure the pitch would have been flattened already. There'd be nothing to gain but the difference in inflection, like Speedy said. I think it might make a good append, maybe...something like Needle's speed-singing bank?
 

Fawkesy

Ruko's Ruffians
Defender of Defoko
All of my voicebanks are "spoken", only because I can't hold out my voice while singing, and it hurts my throat ;o;.
But speaking everything is easier in general. It's not like you just say a short "a"; you can hold it out while speaking too ("aaaaaaaaa") to make it stable as well. Singing isn't the only way to make a stable voicebank.
But this is just my opinion. I see no difference between spoken and sung voicebanks.
 

Ant

Teto's Territory
Defender of Defoko
I recorded halfway in between; in other words, long strings, but keeping the note strong and the pitch consistent.

I noticed today that Lizett is a "sung" bank and she sounds not much different from the others I was using at the same time. I don't think it affects the sound that much.
 

Cdra

possibly dead
Global Mod
Supporter
Defender of Defoko
Singing and speaking are quite different, pronunciation- and stress-wise.  This is particularly evident in a more complex language, like English.  In English, the base phonetics you have are actually different depending on whether you're singing or speaking.  When people try to make a voicebank using spoken English phonetics, it has an extremely awkward accent while singing.  Conversely, some singers have pretty extreme "singing" accents, so when you try to make those banks talk, you hear how awkward it is.  (On that last point, you should listen to Soledi sing and then Generic speak.  You'll wonder how they're the same voice.)

Too often, a "spoken" voicebank comes out sounding bored, a bit flat, and like I already said, awkwardly accented.  An UTAU is meant to sing; when you record the samples in a spoken mode, you create an odd disjunct between how they were made to sound and how they must sound based on what the program does.  Problem here is that I can't think of any examples of spoken banks because... I don't know of any that stick out.

You should really sing your voicebanks.  Even if you "can't sing", you can still, well, sing.  It doesn't have to be super-high-quality singing (that has some effect too but it's hard to measure), but if you can carry a tone through a sung note that's what you should be doing.  And even if you "can't sing" you can probably do that. ewe
 

Obakebaka

Momo's Minion
Thread starter
Really?
That's odd, because the reason this topic came up in my head was that I was using both Halt Tanner's English and Japanese banks, and when I listened to the sound files I thought her English one sounded more spoken whereas her Japanese one sounded more sung.

And regarding pitch inflection in speech, I'm referring to monotone speech. Kind of like what you get when you use a text-to-speech synthesis program. Or at least one of the older ones, like the one used to make FL Chan. "I am a robot from outer space"-style speech.
 

Cdra

possibly dead
Global Mod
Supporter
Defender of Defoko
Obakebaka, I can assure you I sing all of Halt's voicebanks, even if they sound "spoken".

Oh!  I know what you're hearing, though.  Her English was recorded EXTREMELY quietly because my mom was in the next room trying to sleep.  It might be that my quieter, breathier singing sounds more like speech...  But it's not, it's just really muffled singing.  That's about what I mean by "even if you 'can't sing' you can probably do this", though.

I really need to rerecord that so I don't have exactly that problem (I hate that English bank).  SOON!
 

hopeandjoy

Ruko's Ruffians
Defender of Defoko
I sing my banks for the simple reason that UTAU is a singing synth, so one should input samples that create that sound, since singing and talking are two different actions.

That said, if someone were to listen to my samples, they might sound spoken because they're typically steady notes at low tones.
 

LupinAKAFlashTH

DEX's Voice Provider. Woah.
Supporter
Defender of Defoko
Aster Selene said:
Please don't speak your voicebanks.

Speaking banks creates a ton of pitch inflection, which is bad. The more the resampler has to fix your pitch, the more distorted it sounds. Sometimes it can't even fix it and the bank starts to go everywhere on pitch even with mod 0.

In addition spoken pronunciation is very different from sung pronunciation; your bank will sound very awkward and forced.

This.
Not only that, but the less vowel UTAU has to work with, the more it has to stretch it out, and the more distorted the vowel will sound on longer notes.
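
To put rough numbers on the two points above (pitch inflection and vowel stretching), here is a minimal Python sketch. It is not how UTAU or any resampler actually works internally; the function names and the pitch values are made up purely for illustration. It only measures how far a sample's pitch wanders from the intended note (in cents) and how many times a recorded vowel would need to be stretched to fill a note.

```python
import math

def cents_from_target(f0_hz, target_hz):
    """Deviation of one pitch estimate from the target note, in cents."""
    return 1200.0 * math.log2(f0_hz / target_hz)

def pitch_drift_cents(f0_track_hz, target_hz):
    """Largest absolute deviation (in cents) across a sample's pitch track."""
    return max(abs(cents_from_target(f, target_hz)) for f in f0_track_hz)

def stretch_factor(note_ms, vowel_ms):
    """How many times the recorded vowel must be stretched to fill a note."""
    return note_ms / vowel_ms

# Hypothetical pitch tracks (Hz) for the same syllable aimed at A3 = 220 Hz.
# These numbers are invented for the example, not measured from real banks.
sung_track = [219.0, 220.5, 220.0, 221.0]
spoken_track = [205.0, 230.0, 240.0, 210.0]

sung_drift = pitch_drift_cents(sung_track, 220.0)
spoken_drift = pitch_drift_cents(spoken_track, 220.0)

# A short spoken vowel (assumed 250 ms) vs a held one (assumed 1000 ms),
# both stretched to fill a 2000 ms note.
short_vowel_stretch = stretch_factor(2000.0, 250.0)
held_vowel_stretch = stretch_factor(2000.0, 1000.0)
```

With made-up numbers like these, the spoken track drifts much further from the target than the sung one, and the short vowel needs several times more stretching than the held one, which is the gist of the argument for holding a steady note when recording.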
 

Raiyux

Ruko's Ruffians
Defender of Defoko
It's true that one of the differences between speaking and singing is that when you sing, you hold out the notes. But there's more to it than that...singing comes from deeper within you, you know--all of that stuff they teach you in choir. I think a speaking voice is generally defined by not using those parts that you use to sing. It all comes from up within your mouth, throat, etc., and that is what makes the pronunciation different. But I guess you don't necessarily have to have that... 'choral' sound to your voice for it to be considered singing. There are a lot of singers who are able to just belt out their notes and still sound great.

My WIP UTAU, Castella, is sort of in-between on the speech-to-singing scale. When I was recording her, I didn't just force sound out as if I were speaking, but I didn't completely sing the samples, either. Reason being, my singing voice is just too quiet and breathy. But the result I got--she sounds like she's singing, but she has a balanced power level in her voice. As for my other UTAU, Fuu, she totally came from that 'speech' side of my voice. Almost yelled, in a way. Her sound is more nasal than anything. She still sounds like she's singing, and yet she doesn't sound dull at all. It's weird, but she sounds...happy. Hence, moe voice. XD

I say it all just depends on the voicer! If they have a dull voice, the UTAU will sound dull. If they are cheery sounding, the UTAU will sound cheerful. It's all a matter of people's unique sounds and their ability (or inability) to cast different inflections on their voice!

In my opinion, whether you sing or speak your recordings, if it sounds good, just go with it. If it doesn't, try something else. ^^
 