Actual Whisper Bank - is it possible?

LunarConstruct

Ritsu's Renegades
Defender of Defoko
So I have a soft/whispery kind of bank, but I want to record one where it actually sounds like she is just whispering rather than singing softly (if that makes sense).
I tried recording a bank with me just whispering everything, but UTAU just didn't see it or something... it could see that the files were there, but it treated them as blank, so it didn't play any sound.
Does anyone know how to make a whispering bank work?
 

Zoku

making doper vocaloid music than the rest
Defender of Defoko
If you want a bank that's literally whispering, I think it would sound almost metallic (depending on the resampler). Whispering gives off a lot of air, and that will be reflected in the samples. From my experience, your samples will sound more static-y/metallic when there's more air in the recording.
 

na4a4a

Outwardly Opinionated and Harshly Critical
Supporter
Defender of Defoko
I don't know exactly, but this is my loose grasp of it, so it isn't going to be entirely accurate.

Short answer: Probably not.

Long answer: Probably not, but it is possible with plenty of work and trial and error.

Even longer answer (with typos):
A true whispering voicebank is not possible under normal circumstances because there are no discernible voiced frequencies, just unvoiced noise.
A resampler works on the idea of pitch shifting: it uses the tone of your voice to determine the pitch and other useful information about your voice (or whatever the sample consists of).
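
To illustrate why unvoiced noise gives a pitch detector nothing to work with, here is a minimal sketch using normalized autocorrelation. This is an assumption for illustration only, not the actual algorithm of any UTAU resampler; the names `periodicity`, `make_tone` and `make_noise`, the 8000 Hz sample rate and the 200 Hz test tone are all made up for the example.

```python
import math
import random

SAMPLE_RATE = 8000  # hypothetical sample rate, chosen to keep the example fast
N = 2000            # number of samples per test signal


def periodicity(samples, min_lag, max_lag):
    """Find the lag (in samples) with the highest normalized autocorrelation.

    Returns (best_lag, strength). Strength near 1.0 means the signal repeats
    cleanly at that lag; strength near 0.0 means no repeating pattern.
    """
    best_lag, best_strength = min_lag, 0.0
    for lag in range(min_lag, max_lag + 1):
        a = samples[:-lag]  # the signal
        b = samples[lag:]   # the same signal shifted by `lag` samples
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        strength = num / den if den > 0 else 0.0
        if strength > best_strength:
            best_lag, best_strength = lag, strength
    return best_lag, best_strength


def make_tone(freq_hz, n=N, rate=SAMPLE_RATE):
    """A pure sine wave, standing in for a voiced vowel."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def make_noise(n=N, seed=0):
    """Gaussian noise, standing in for a pure whisper."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# Search lags 20..60, i.e. roughly 133 Hz to 400 Hz at this sample rate.
tone_lag, tone_strength = periodicity(make_tone(200.0), 20, 60)
noise_lag, noise_strength = periodicity(make_noise(), 20, 60)

print("tone : pitch", SAMPLE_RATE / tone_lag, "Hz, strength", round(tone_strength, 3))
print("noise: strength", round(noise_strength, 3), "(no reliable pitch)")
```

The tone produces a strength close to 1.0 at a lag of 40 samples (200 Hz), while the noise never gets far from zero at any lag, so whatever "pitch" is reported for it is essentially arbitrary.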

This information is probably also used when stretching the voice. If you have unwanted sounds in a sample, or even just excess breath, this noise can become distorted as the resampler tries to figure out and fill the gaps in the sound.
However, looping samplers don't have stretching distortion because they simply loop over a segment of the sample.

Of course there are other things that can go wrong but I'm not a wizard and can't even begin to explain.

However, if you still have some voice in the sample AND your samples are very clean, then it will give the resampler some information to go off of.
Think of those ASMR videos where the voice is sometimes audible and other times it's simply air that can be recognized as vowels. You'd want to match the audible whisper.
This will give you a bank that is a whisper but is fully functional and can be stretched/looped and shifted in pitch.

Another option that may work is to treat it as you would a breath. In this case your voicebank is completely unvoiced. You would want a voiced vowel at the beginning and/or end of the sample to ensure frq file generation works as intended. Then you will want to use a looping sampler (like tn_fnds) when rendering, because stretching samplers like fresamp and resampler will distort it into a harsh shrieking noise.
This kind of bank would be monotonous, though.

bkh_01 would also be an interesting candidate. Even if it doesn't work as intended, you could still use its render to fine-tune your oto. bkh_01 kinda works on the idea of the contents of your voice rather than the sample itself (?) so it can be hit or miss.
Also, those samples would need to be flawless.

Another option would be to make the whisper samples artificially.
You can use something like bkh_01 to remove the voice from the noise and essentially get whispering.
There are also options for removing the voiced tone from the wave file itself; in that case you could generate the frq files beforehand and use the modified files after.

So it's possible, but you've got to decide how you are going to achieve it... I guess...
Sorry if this isn't very helpful.
Forgot to mention:

Resampler: stretching
Fresamp: stretching
TIPS: stretching
tn_fnds: looping
EFB-GT/EFB-GW: looping*
bkh_01: magic
w4u: stretching (pretty sure)
M4: stretching

*It loops, but it actually plays in reverse when it reaches the end rather than looping from the beginning. So it's kinda "back and forth".
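
The difference between plain looping and the "back and forth" kind can be sketched like this. It is only an illustration of the two ideas on a list of numbers, not how tn_fnds or EFB-GT are actually implemented; the function names are hypothetical, and whether a real sampler repeats the endpoint samples at each turnaround is an assumption (this sketch does not).

```python
def loop_forward(segment, length):
    """Fill `length` samples by replaying the segment from its start each time."""
    if not segment:
        raise ValueError("segment must not be empty")
    return [segment[i % len(segment)] for i in range(length)]


def loop_ping_pong(segment, length):
    """Fill `length` samples by playing the segment forward, then in reverse,
    then forward again, without repeating the two endpoint samples."""
    if not segment:
        raise ValueError("segment must not be empty")
    # One full cycle: forward pass plus the reversed interior of the segment.
    cycle = list(segment) + list(segment[-2:0:-1])
    return [cycle[i % len(cycle)] for i in range(length)]


segment = [1, 2, 3, 4]
print(loop_forward(segment, 10))    # [1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
print(loop_ping_pong(segment, 10))  # [1, 2, 3, 4, 3, 2, 1, 2, 3, 4]
```

The forward loop jumps from the last sample straight back to the first, while the back-and-forth loop always moves to a neighbouring sample, which avoids a jump at the loop point.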
 

GothAmaterasu

Ruko's Ruffians
Defender of Defoko
I honestly think I made a decent whisper bank c:

I suppose it's not pure whispering, there's still some voice in there, but the files aren't corrupted at least xD
 

LunarConstruct

Ritsu's Renegades
Defender of Defoko
Thread starter
Thanks, I'll try some of this out when I get home tonight. Hopefully I can get it to work...
 
