I am coming back to UTAU and would like to create a new Japanese and English UTAU voicebank, and I would like advice.

TheJoshMorin

Teto's Territory
I have recently gotten back into music production, and it has made me reminisce about the days when my main interest was UTAU and Vocaloid. Now I'm interested in using them in my work. My childhood dream was to create a high-quality Japanese and English voicebank, and I have some questions that I would appreciate being answered. I will try to list the questions so they're easy to understand.

I live in an apartment building and I can't be too loud. I have a small recording booth I created in my closet. I currently use the Blue Snowball USB microphone. I know how to use OREMO. The standard last time I was here was VCV or CVVC for Japanese and VCCV or Arpasing for English. What recording method is the most efficient in your opinion? Specific recording lists would be appreciated too.

I have a decent understanding of how to configure the OTO of voicebanks, but I have no idea how to OTO some of the more recent recording styles other than by generating one with MoreSampler. Do people still use SetParam? If you suggest a recording list, I would appreciate a link to a tutorial on how to configure the OTO, and any other advice you may have.

I understand how to use the UTAU program itself inside and out, but I've seen a program called OpenUTAU and I am wondering if that's what people are using now. If not, are people still using the UTAU program by itself? Also, if you've suggested a voicebank or recording style, any plugins that would help me use the voicebank would be appreciated. Thank you for reading!
 

Thehyami

Ruko's Ruffians
Defender of Defoko
I made my own reclist, and in hindsight, it looks a lot like Delta reclist CVVC number 7. We use the same alias system, which is X-SAMPA. You can use https://tophonetics.com to translate English words to X-SAMPA. I do not like CZ's VCCV because it is not friendly to non-native speakers: it is hard to know how to write the ust for English words. X-SAMPA, by contrast, is based on the IPA (the International Phonetic Alphabet), which is what the academic world and professional linguists use. Arpabet (the alphabet used in Arpasing) is also used academically, but it is specific to English and covers fewer sounds. It is easy to map between Arpabet and X-SAMPA, though.
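To illustrate how direct the Arpabet to X-SAMPA mapping is, here is a minimal Python sketch. The table uses standard X-SAMPA symbols for one common correspondence and is only an illustration; a particular reclist may use its own filename-safe spellings instead (for example ts+ for tS), and the function name is made up for this example.

```python
# Illustrative Arpabet -> X-SAMPA table (standard X-SAMPA symbols).
# Assumption: a given reclist may spell some of these differently.
ARPABET_TO_XSAMPA = {
    "AA": "A", "AE": "{", "AH": "V", "AO": "O", "EH": "E",
    "IH": "I", "IY": "i", "UH": "U", "UW": "u", "ER": "3`",
    "AY": "aI", "AW": "aU", "EY": "eI", "OW": "oU", "OY": "OI",
    "CH": "tS", "JH": "dZ", "SH": "S", "ZH": "Z",
    "TH": "T", "DH": "D", "NG": "N", "Y": "j", "HH": "h", "R": "r\\",
}
# These consonants are written with the same letter in both systems.
for letter in "BDFGKLMNPSTVWZ":
    ARPABET_TO_XSAMPA[letter] = letter.lower()


def arpabet_to_xsampa(phonemes):
    """Convert a list of Arpabet symbols to X-SAMPA.

    Stress digits such as the "1" in "IH1" are stripped first.
    Unknown symbols raise KeyError instead of being guessed.
    """
    return [ARPABET_TO_XSAMPA[p.rstrip("012").upper()] for p in phonemes]


# "thing" in Arpabet is TH IH1 NG
print(arpabet_to_xsampa(["TH", "IH1", "NG"]))  # ['T', 'I', 'N']
```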

The difference between my reclist and the Delta reclist is that mine is even more cut down. My reclist has no diphthongs or consonant clusters, and I only use 10 vowels. Delta has 14 vowels; I leave out these 4: v+ (V), u+ (U), 3, and ju. Lastly, I only oto the CV parts. I'm not a native speaker, and my reclist is this minimal simply because the other sounds are hard for me to pronounce.

Here are some samples:
Lose Somebody by Kygo, OneRepublic
You'll Always Find Your Way Back Home by Miley Cyrus
You Belong with Me by Taylor Swift

I do not use BGM when recording, because it makes me feel rushed and I'll fail to pronounce correctly. If you are using the Delta reclist you'll find lines like this:
_a_pa_sa_ta_da-
To record this line, I suggest you read it continuously as:
appassattadda
Then you can oto CV and VC (ap, pa, as, sa, and so on).
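As a rough sketch of that step, the Python snippet below lists the VC and CV aliases for one line. It assumes the line format shown above (syllables separated by "_" with a trailing "-"), that every syllable ends in a single-character vowel, and that VC aliases are written without a space as in this post; the function name is hypothetical.

```python
def aliases_from_line(line):
    """List the VC and CV aliases to oto for one reclist line.

    Assumptions: syllables are separated by "_", the line may end
    with "-", and each syllable ends in a one-character vowel.
    """
    syllables = [s for s in line.strip("-").split("_") if s]
    aliases = []
    prev_vowel = None
    for syllable in syllables:
        consonant, vowel = syllable[:-1], syllable[-1]
        if consonant:
            if prev_vowel is not None:
                # VC: previous vowel flowing into this consonant
                aliases.append(prev_vowel + consonant)
            # CV: the syllable itself
            aliases.append(syllable)
        prev_vowel = vowel
    return aliases


print(aliases_from_line("_a_pa_sa_ta_da-"))
# ['ap', 'pa', 'as', 'sa', 'at', 'ta', 'ad', 'da']
```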

I oto them manually using setParam. I think the basic rules are just fine. For example, say you want to oto sa. When zoomed in, it'll look like this:
sssaaa
s|s|s|a|a|a
From left to right, the lines are left blank / offset, overlap, pre-utterance, consonant, and right blank / cutoff. For stop consonants like p, t, k, b, d, g, ts+ (tS), and dz+ (dZ), the offset and overlap will be close to the pre-utterance. Like this:
taaa
|||t|a|a|a
And for the VC, you just reverse the order:
aaasss
a|a|a|s|s|s
aaat
a|a|a|t||
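For reference, these five lines end up as one row of oto.ini in the form filename=alias,offset,consonant,cutoff,preutterance,overlap, with all values in milliseconds. The Python sketch below parses such a row and returns the markers in the left-to-right order described above. The numbers in the sample row are made up, the names are hypothetical, and only the negative-cutoff form (measured from the offset) is handled.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    filename: str
    alias: str
    offset: float        # left blank, ms from the start of the file
    consonant: float     # fixed region, ms from the offset
    cutoff: float        # right blank; negative means ms from the offset
    preutterance: float  # ms from the offset
    overlap: float       # ms from the offset


def parse_oto_line(line):
    """Parse one 'file.wav=alias,offset,consonant,cutoff,preutterance,overlap' row."""
    filename, _, rest = line.strip().partition("=")
    alias, *numbers = rest.split(",")
    offset, consonant, cutoff, preutterance, overlap = (float(n) for n in numbers)
    return OtoEntry(filename, alias, offset, consonant, cutoff, preutterance, overlap)


def marker_positions(entry):
    """Marker positions in ms from the file start, listed left to right."""
    if entry.cutoff >= 0:
        # A non-negative cutoff is measured from the end of the file,
        # so the file length would be needed; not handled in this sketch.
        raise ValueError("only negative cutoff values are handled here")
    return {
        "offset": entry.offset,
        "overlap": entry.offset + entry.overlap,
        "preutterance": entry.offset + entry.preutterance,
        "consonant": entry.offset + entry.consonant,
        "cutoff": entry.offset - entry.cutoff,
    }


# Made-up values for an "sa" sample
entry = parse_oto_line("sa.wav=sa,100,150,-400,80,30")
print(marker_positions(entry))
```

For a stop consonant like ta, the overlap and preutterance values would be small, so the first three markers sit close together just before the burst.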

I like Utau because it has more plugin support. What I like most in OpenUtau is that I can modify the overlap and preutterance of notes directly without going to the properties. If Utau had this feature, I would definitely recommend Utau to you.
 

nneko

Teto's Territory
Defender of Defoko
(I will try my best to help!)
I live in an apartment building and I can't be too loud. I have a small recording booth I created in my closet. I currently use the Blue Snowball USB microphone. I know how to use OREMO. The standard last time I was here was VCV or CVVC for Japanese and VCCV or Arpasing for English. What recording method is the most efficient in your opinion? Specific recording lists would be appreciated too.
In my opinion, Japanese CVVC is quicker to record, but it takes more work to oto. VCV takes longer to record, but it's easier to oto.
There is an efficient VCV reclist available here on the forum which removes redundant and uncommon sounds:

This is the only CVVC resource I know of (I'm sorry!):

Tutorial for configuring CVVC (you'll have to use Google Translate):

I'm not too familiar with making English voice banks, but here are some reclists for Arpasing:
Here is a tutorial for configuring an Arpasing voice bank. Even if you use Moresampler, you will have to adjust the oto to make it sound smoother:
(Scroll down for tutorial)
https://arpasing.neocities.org/en/resources/vb-creation.html

VCCV English resource:


I understand how to use the UTAU program itself inside and out but I've seen a program called OpenUTAU and I am wondering if that's what people are using now and if not are people still using the UTAU program by itself?
OpenUTAU contains a few quality-of-life features, like:
A built-in romaji to hiragana converter,
Phonemizers for different languages (like Arpasing English),
Multiple tracks,
(and more!)

At the moment, OpenUTAU does not have a way to oto voice banks, so you will have to use UTAU or SetParam.

I hope this helps in some way!
 

TheJoshMorin

Teto's Territory
Thread starter
Thank you both for the help! All of this was very useful and I'm very grateful for it. I will take all of this into consideration. :love:
 
