Basic Multipitch Bank Tutorial!
Hey guys, there seem to be few to no resources out there on how to make a multipitch UTAU bank. Since some people asked, I decided to put together a little guide for those who may be interested in creating one, or who just want to skim the information and see if it's a worthwhile endeavor. I would like to thank Aster/Fuzzle, who started me off with the basic information on multipitch bank creation (ten million years ago) when I asked her on Twitter.
First off, I want to clear up some common misconceptions and cover some general info from the get-go:
- Multipitch is not exclusive to full VCV banks - you can create CV multipitch as well, or lite VCV or CVVC, if you are not comfortable with full VCV territory.
- The number of pitches (and the type of pitches, recording style, distance between pitches) you choose to record is completely up to you.
However, I personally do not recommend that you record two pitches right next to each other (E3/F3 for example) unless there's a big difference in power or you intend to use it as an alternative pitch in your bank. I also do not recommend recording two consecutive pitches super far from each other, as there will be a violent jump from one to the next.
- It's generally not recommended to record too many pitches (I have heard that 6 pitches is a recommended stopping point), because the note range capacity of a single pitch can usually cover a lot of ground on its own. I personally think a well done tripitch voicebank is enough to make a very good, well rounded range for an UTAU's voice.
Consider the general range of the monopitch bank. Multipitch should be an extension of range/power/expression, not something that muddles or confuses the bank's original capabilities. However, extra recordings might be worth considering if your voicebank has additional/alternative "modes" that a user can toggle (whisper, shout, falsetto, head voice, growl, etc.).
- "Powerscale" is one style of multipitch: the bank increases in power gradually as the pitch climbs, so the most powerful sound is at the top and the quietest sound is at the bottom (Ritsu's "KIRE" bank is a notable example). "Reverse-Powerscale" is where the softest sound is at the top and the loudest sound is at the bottom.
You do not HAVE to record your multipitch strictly in that style. People have done variations on it, and there are many ways you can approach it. There are also multipitch banks with no apparent fluctuation in vocal strength, where the extra pitches serve solely as a range extension of the monopitch bank.
- Multipitch is controlled by a tool in UTAU called the prefix.map (you access it by going to Tools > Prefix.map). This tool lists all the notes and lets you assign a suffix to each one. (It handles prefixes too, but multipitch employs suffixes for the most part; I haven't had to use prefixes yet.)
Okay, now that we have some general information out of the way, let's get started. I will demonstrate the basics of how to make a Japanese multipitch bank, building a test voice bank as I go along for the sake of having a concrete example. It will be a simple tri-pitch multipitch/powerscale bank. (Similar principles apply for more or fewer pitches.)
(I will assume you know how to operate the UTAU program; load voice banks, record, oto, use voice banks, etc)
PART 1 > Starting Out
To begin with, decide what format and reclist you want to use: CV/liteVCV/CVVC/full VCV. Whether the reclist is in Romaji or Hiragana does not matter. (liteVCV can produce results similar to full VCV with far fewer recordings, if you want to try VCV but are cautious about it. CVVC offers more control over consonants and, I believe, needs fewer recordings than VCV as well, but it's more of a hassle to edit USTs for.)
Then, decide on the pitch you're most comfortable with for starters.
To do this, start on the first recording (a, a-a-i-a-u-e-a, etc.) with your default voice (unless your UTAU is voice acted, then do the "easiest" reachable voice for them) and figure out the pitch of that recording. (If you are planning to add an additional pitch on top of an already existing monopitch bank, you can skip over to the next steps.)
There is a "Pitch Guide" in OREMO to help with that, but if you don't use OREMO to record, you can use this useful flash application to figure out what pitch your recording is at.
I made my recording and the pitch is E3. (Then, record the rest of your bank and give it an oto.ini. You should have a full monopitch bank by the end of this step.)
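If your tool only reports a frequency in Hz rather than a note name, the conversion is simple arithmetic. Here is a minimal sketch, assuming equal temperament with A4 = 440 Hz and the octave numbering where MIDI note 60 is C4 (the function name note_name is just something I made up for illustration):

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(freq_hz: float) -> str:
    """Return the nearest equal-tempered note name for a frequency.

    Assumes A4 = 440 Hz and the convention where MIDI note 60 is C4.
    """
    # 69 is the MIDI number of A4; each semitone is 1/12 of an octave.
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

print(note_name(164.8))  # a recording around 164.8 Hz comes out as E3
```

A recording will rarely sit exactly on a note, so this just rounds to the nearest semitone.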
PART 2 > Assigning the Suffix
Since this is your starting pitch, it is up to you whether you want to add a suffix to the oto.ini or not. (If you leave it unsuffixed, then whenever the prefix map is entirely clear, the voicebank will default to that pitch no matter how high or low the UTAU sings.)
I left the starting pitch's suffix blank for my own UTAU's multipitch bank (Electrolysis, it defaults to Galvan (D3) when I have no suffixes on the prefix map), but for the purposes of this tutorial, I will give it a suffix.
I recommend using Tady's oto.ini Suffixer Tool for this next step (read his instructions on how to use the tool), as it will quickly add suffixes to your bank's oto.ini.
Give your default pitch a suffix. I gave my start pitch the "E3" suffix.
The oto.ini should look like this after putting it through his Tool.
People typically use the letter and number of the pitch/octave they recorded at (C3, F4, etc.), but you are more than welcome to suffix it differently. I've seen people use letters or kanji to indicate different modes of a voice bank (like "w", "s", etc.).
PART 3 > Recording Additional Pitches
Now, I'm going to record two more pitches for this test bank. Try not to use different reclists when recording new pitches, just stick to the same one.
Make new folders for each separate pitch that your bank will have - like so, and put the separate oto.ini file in those folders:
Each separate folder should contain everything that is expected of a finished UTAU voicebank: the .wav files, .frq files, and the oto.ini. Apart from the recordings themselves, the only difference between folders is the oto.ini (which differs in configuration and suffix pitch assignment). You can keep other files in the main folder (or an extra folder), like concept art, breath files, icon, readme, etc.
Since I intend to make this tri-pitch and powerscale, I have recorded B3, then F4 pitch. B3 will serve as a transitory pitch with a bit more notable power than the E3 pitch, and F4 will be a high strong pitch.
Once you have those pitches done, place the sound files in their respective folders. You can then reuse the oto.ini from your starting pitch: copy it into each new folder and do a quick find+replace in each copy to swap the suffix for the appropriate one. Or, you can use Tady's OTO suffixer tool again.
PART 4 > Using a Multipitch Bank in UTAU
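If you'd rather script the suffixing than do it by hand, here is a rough sketch of what adding or swapping a suffix amounts to. It assumes the usual oto.ini line layout of filename=alias,offset,consonant,cutoff,preutterance,overlap, and that an empty alias means the file name (minus ".wav") is used as the lyric. The function names are my own, not part of Tady's tool or any other tool.

```python
def _split(oto_line: str):
    """Split one oto.ini line into (filename, alias, separator, parameters)."""
    filename, rest = oto_line.split("=", 1)
    alias, sep, params = rest.partition(",")
    return filename, alias, sep, params

def add_suffix(oto_line: str, suffix: str) -> str:
    """Append `suffix` to the alias of one oto.ini line."""
    line = oto_line.rstrip("\r\n")
    if "=" not in line:
        return line  # leave blank or unexpected lines alone
    filename, alias, sep, params = _split(line)
    if not alias:
        # Assumption: with no alias, the file name without ".wav" is the lyric.
        alias = filename[:-4] if filename.lower().endswith(".wav") else filename
    return f"{filename}={alias}{suffix}{sep}{params}"

def replace_suffix(oto_line: str, old: str, new: str) -> str:
    """Swap a trailing `old` suffix on the alias for `new`; touch nothing else."""
    line = oto_line.rstrip("\r\n")
    if "=" not in line:
        return line
    filename, alias, sep, params = _split(line)
    if old and alias.endswith(old):
        alias = alias[: -len(old)] + new
    return f"{filename}={alias}{sep}{params}"

print(add_suffix("ka.wav=ka,100,80,-200,60,20", "E3"))
print(replace_suffix("ka.wav=kaE3,100,80,-200,60,20", "E3", "B3"))
```

Only the alias is touched, so a file name that happens to contain "E3" is left alone, which a blind find+replace would not guarantee. The timing values in the example lines are made up. If you run something like this over a whole file, keep a backup first, and note that Japanese oto.ini files are usually saved in Shift-JIS, so open them with that encoding.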
Loading a multipitch bank into the UTAU program (if you're not keeping it in the voice directory) is not straightforward at first: if you go into one of the pitch folders to select a sound file, UTAU will end up loading only that pitch's folder. To remedy this, leave a sound file in the main folder, outside of the pitch folders (a breath sample can work), and click on that to load the bank. The UTAU program will then load all the folders from the voicebank.
If the entire voicebank is aliased and you need to use the VCV/CV converter for a UST, you might have to switch to another VCV or CV voicebank to convert the UST to fit your UTAU.
Now, you want to use a prefix map to set up your pitches so sound will play back in a UST file. The purpose of the prefix map is to read the suffix at the end of the voicebank's alias and assign those to the notes that you set it to, regardless of whether the note itself has a suffix or not.
The way you set up the pitches is completely up to you; there's no set-in-stone rule. It will change depending on the range/expression of the song, but the straightforward/default approach is to assign the pitches to the range of notes they're meant to hit. For my tripitch bank, it would be sensible to do this arrangement:
But seeing how spread apart they are, I decided to close them in a bit more so that you can hear more variety. Play around with your UTAU's pitches and see what works best for the song you want to cover!
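If you want a neutral starting point before tweaking by ear, one simple rule is to give every note the suffix of the closest recorded pitch. The sketch below is just that rule and not necessarily the arrangement I ended up with. It assumes the prefix map covers C1 to B7 and that your suffixes are plain note names like E3; the helper names are made up for this example.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_number(name: str) -> int:
    """Convert a note name like 'E3' or 'G#3' to a MIDI number (C4 = 60)."""
    pitch, octave = name[:-1], int(name[-1])
    return NOTE_NAMES.index(pitch) + 12 * (octave + 1)

def nearest_pitch_map(recorded, low="C1", high="B7"):
    """Map every note from `low` to `high` to the closest recorded pitch.

    When a note is equally far from two pitches, the lower pitch wins.
    """
    pitches = sorted(recorded, key=midi_number)
    mapping = {}
    for n in range(midi_number(low), midi_number(high) + 1):
        name = f"{NOTE_NAMES[n % 12]}{n // 12 - 1}"
        mapping[name] = min(pitches, key=lambda p: abs(midi_number(p) - n))
    return mapping

mapping = nearest_pitch_map(["E3", "B3", "F4"])
for note in ["C3", "G3", "G#3", "D4", "D#4", "C5"]:
    print(note, "->", mapping[note])
```

You would then enter the resulting suffixes through Tools > Prefix.map. Shifting the boundaries up or down from there is how you bring a stronger or softer pitch in earlier.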
If you wish to use the UTAU's default voice (an unsuffixed pitch), you add a ? to the beginning of the lyric.
If you wish to use a different/specific pitch on a certain note that is outside of what the prefix.map generally covers, just type the suffix at the end of the lyric.
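To make the last two tricks a bit more concrete, here is a simplified model of how a lyric ends up pointing at an alias. This is my reading of the behavior described above, not UTAU's actual code. In particular, the fallback to the lyric as typed when the suffixed alias doesn't exist is an assumption, and the function name resolve_alias is made up.

```python
def resolve_alias(lyric: str, note: str, prefix_map: dict, aliases: set) -> str:
    """Pick the oto alias for a lyric on a given note (simplified model)."""
    if lyric.startswith("?"):
        # A leading "?" skips the prefix map, so the unsuffixed alias is used.
        return lyric[1:]
    candidate = lyric + prefix_map.get(note, "")
    # Assumption: if the suffixed alias does not exist, the lyric is used as
    # typed, which is why a hand-typed suffix still reaches its own pitch.
    return candidate if candidate in aliases else lyric

aliases = {"ka", "kaE3", "kaB3", "kaF4"}
prefix_map = {"C4": "B3"}

print(resolve_alias("ka", "C4", prefix_map, aliases))    # suffix from the map
print(resolve_alias("?ka", "C4", prefix_map, aliases))   # map bypassed
print(resolve_alias("kaF4", "C4", prefix_map, aliases))  # hand-typed suffix
```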
And here's the quick multipitch bank example I did for this tutorial. I used Denki voice to demonstrate this.
Anyways, that's all I can think of as far as that goes. It's supposed to be a very basic tutorial for people who wish to get into multipitch voice bank creation/usage, so I don't really get into specifics like -how- to record a better strong/whisper pitch or how to improve pitch transitions. But if there was something confusing or unclear, if there is info I neglected to cover, or if you have any other questions, let me know!