In-Depth Guide to Multipitch Voicebanks

Recording, configuring, and using multipitch voicebanks in UTAU.

  1. Pratzelwurm
    Multipitch is something many users seem to struggle with, but it is also something that I have done a lot of research on, so I've elected to make a written guide on creating and using multipitch voicebanks. If there are any terms you want defined that aren't in the glossary, let me know.

    Finding Your Range
    This is arguably the most difficult section. To get the most out of your own multipitch bank, you should familiarize yourself with your own vocal range, namely where it shifts in tone. Typically, a multipitch voicebank will include at least three pitches: one for the chest voice^1, one for the head voice^2, and one for the throat voice^3.

    To select the best pitches for your own voicebank, first establish your comfortable singing range -- that is, the full range you can sing without any vocal strain. Most UTAU users do not have vocal training, so it is very important that you do not attempt to record at a pitch which causes discomfort, as doing so can actually cause damage to your vocal cords. For example, I can sing from C3 to C5, but my comfortable range lies from E3 to A#4.

    In general, people fall into one of these categories for professional singing:
    • Bass (E2 - E4)
    • Baritone (G2 - G4) (most men)
    • Tenor / Contralto (C3 - C5) (this is what I am)
    • Alto / Countertenor (E3 - E5)
    • Mezzo-soprano (G3 - G5) (most women)
    • Soprano (C4 - C6)
    If you lack professional training, your range will most likely be smaller than these, and your comfortable range will be smaller still. If you are uncertain, use a piano or digital tuning fork to sing as high and as low as you can without strain, then take note of what those pitches are.

    Selecting Pitches
    There are two main things to avoid when selecting pitches: extreme jumps and redundancy. What I mean is that you want to avoid shifts in pitch that are too wide and will not transition naturally in UTAU, but you also want to avoid recording two pitches that are too similar, as that creates more work for both you and your users, and can actually result in your UTAU sounding less natural.

    My advice is this: find an octave^4 that you feel best shows off your voice. For my normal singing voice, I use the octave on a G, with G3 showing off my chest voice, and G4 showing off my head voice. Some voicebanks will go an octave and a half, or even two octaves; it just depends on what you as a singer feel comfortable recording, and also what you feel is necessary for the voicebank. Again, we want to avoid redundancy, so if your voice sounds almost the same after a certain point, there's not much use in recording another pitch.

    Next, decide how many intervals between the two notes you want to cover. Typically, the largest jump you should have between notes is 6 halfsteps^5, as jumps larger than that tend to be unnatural sounding, so a one-octave multipitch should have at least three pitches. Now, a 6 halfstep jump may be okay, but smaller jumps of 3 or 4 are usually better; it just depends on how much effort you want to put into it. That being said, everyone's voice is different, so it's best to experiment to find out what works best for you. Personally, I like to have all of my transitional pitches the same distance apart over an octave, but, again, different strokes for different folks.

    Here is a list of pitch outlines that you can try. If you want to go more than one octave, just expand the pattern higher or lower.
    Three pitches (6 halfsteps)
    C F# C
    C# G C#
    D G# D
    D# A D#
    E A# E
    F B F
    F# C F#
    G C# G
    G# D G#
    A D# A
    A# E A#
    B F B

    Four pitches (4 halfsteps)
    C E G# C
    C# F A C#
    D F# A# D
    D# G B D#
    E G# C E
    F A C# F
    F# A# D F#
    G B D# G
    G# C E G#
    A C# F A
    A# D F# A#
    B D# G B

    Five pitches (3 halfsteps)
    C D# F# A C
    C# E G A# C#
    D F G# B D
    D# F# A C D#
    E G A# C# E
    F G# B D F
    F# A C D# F#
    G A# C# E G
    G# B D F G#
    A C D# F# A
    A# C# E G A#
    B D F G# B
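
    If you'd rather not work these outlines out by hand, here is a rough Python sketch that generates them. It is not part of UTAU; the function names (note_to_number, number_to_note, pitch_outline) are my own for illustration, and it assumes sharps-only note names with a single-digit octave, like "A3".

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_to_number(note):
    """Convert a note name such as 'A#3' to a halfstep count (C0 = 0)."""
    name, octave = note[:-1], int(note[-1])
    return octave * 12 + NOTE_NAMES.index(name)


def number_to_note(number):
    """Convert a halfstep count back to a note name such as 'A#3'."""
    octave, index = divmod(number, 12)
    return f"{NOTE_NAMES[index]}{octave}"


def pitch_outline(lowest, step, span=12):
    """List the pitches from `lowest` up to `span` halfsteps above it,
    spaced `step` halfsteps apart (span=12 covers one octave)."""
    if span % step != 0:
        raise ValueError("step must divide the span evenly")
    start = note_to_number(lowest)
    return [number_to_note(start + offset) for offset in range(0, span + 1, step)]


print(pitch_outline("G3", 6))  # ['G3', 'C#4', 'G4']
print(pitch_outline("A3", 3))  # ['A3', 'C4', 'D#4', 'F#4', 'A4']
```

    To go past one octave, pass a larger span, for example pitch_outline("G3", 6, span=24) for two octaves.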

    If you're not sure about your pitches, and you want to test them before recording a full voicebank, I suggest recording a quick bank consisting of only a few samples (I generally just record the vowels), otoing that, and testing it in the software to make sure it has the sound you want.

    Voicebank Setup
    This part is pretty simple. Create one main folder for your voicebank; this is what I refer to as the "directory," but that's not an official term or anything. The directory should include the voicebank's icon.bmp, sample.wav, character.txt, readme.txt, and files. Inside the directory, create one subfolder for each pitch. For instance, if you've recorded five pitches on an A3, you will have five subfolders named "A3" "C4" "D#4" "F#4" and "A4". In each subfolder, store all of the samples of just that pitch, including that pitch's individual oto.ini. If you want, you can add a suffix to each of your .wav files with the pitch name as well (ex. "かC4.wav"); this is common, but not necessary, as separating the pitches into separate folders will differentiate them already.

    By using subfolders, you are essentially combining what would normally be multiple voicebanks into just one, allowing you to use all of them in UTAU at the same time. This technique can also be used for combining different appends of the same character, provided it is configured properly (see next section).
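
    As a quick sanity check on the folder layout described above, here is a small Python sketch that lists which pitch subfolders are missing their oto.ini. The function name pitches_missing_oto and the folder name "MyVoicebank" are placeholders of mine, not anything UTAU provides.

```python
from pathlib import Path


def pitches_missing_oto(directory, pitches):
    """Return the pitches whose subfolder has no oto.ini.

    A pitch is also reported if its subfolder does not exist at all.
    """
    root = Path(directory)
    return [pitch for pitch in pitches if not (root / pitch / "oto.ini").is_file()]


# "MyVoicebank" is a hypothetical directory name; point this at your own bank.
print(pitches_missing_oto("MyVoicebank", ["A3", "C4", "D#4", "F#4", "A4"]))
```

    An empty list means every pitch folder has its own oto.ini.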

    The most important step in creating a multipitch voicebank is aliasing. If you do not alias your samples properly, the voicebank will not work! While otoing, ensure that each separate configuration has a suffix on the alias indicating which note it was recorded on. The sample "かかきかくかんかC4.wav" should be oto'd as "- かC4" "a かC4" "a きC4" etc. For voicebanks encoded in roman characters, you can use an underscore to better separate the suffix from the alias, e.g. "ka_C4" instead of "kaC4".
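
    If you already have an oto without suffixes, the aliases can be suffixed in bulk. Below is a minimal Python sketch of that idea, assuming the usual oto.ini line layout of filename=alias,offset,consonant,cutoff,preutterance,overlap. The function name add_pitch_suffix is my own, the numbers in the example are made-up placeholders, and the sketch only works on text lines; reading and writing the file (and its encoding) is left out.

```python
def add_pitch_suffix(oto_line, pitch, separator=""):
    """Append a pitch suffix to the alias of one oto.ini line.

    Assumes the layout filename=alias,offset,consonant,cutoff,preutterance,overlap.
    Lines with an empty alias, or an alias that already ends in the pitch,
    are returned unchanged.
    """
    filename, equals, params = oto_line.partition("=")
    if not equals:
        return oto_line  # not an oto entry; leave it alone
    fields = params.split(",")
    alias = fields[0]
    if alias and not alias.endswith(pitch):
        fields[0] = f"{alias}{separator}{pitch}"
    return filename + "=" + ",".join(fields)


# The timing values here are placeholders, not real oto settings.
print(add_pitch_suffix("かかきかくかんかC4.wav=- か,100,80,-300,60,20", "C4"))
print(add_pitch_suffix("ka.wav=ka,100,80,-300,60,20", "C4", separator="_"))
```

    The second call uses the underscore separator for a romaji bank, giving the alias "ka_C4".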

    If you have different appends in the same voicebank, it is a good idea to add something to the suffix to indicate which append it belongs to. For instance, if you have a core tone and a powerful tone, you might want to add a "p" to the power tone's aliases, like "かC4p", so the user knows which tone they're using.

    Using Multipitch Voicebanks
    There are two options for using multipitch voicebanks in UTAU: you can manually add a suffix to each lyric in the UST, either by typing it in or using a plugin, or you can set a prefix map to do so automatically. Prefix maps basically tell UTAU to read the lyric ending in a specific suffix depending on what note it is on. In UTAU, navigate to Tools > Edit In this window, shift+select all of the notes you want to read from a specific pitch, type in that pitch in the "Suffix" section, then click set. Once you've done this for all pitches, click OK to save your changes.
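
    To plan which notes should read from which pitch before filling in the window, here is a rough Python sketch that assigns every note to the nearest recorded pitch. Nearest-pitch assignment is my own assumption for illustration (you may prefer to draw the boundaries by ear), the C1 to B7 range is assumed to match the piano roll, and the function names are hypothetical. It only prints a plan; it does not write any UTAU file.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_to_number(note):
    """Convert a note name such as 'A#3' to a halfstep count (C0 = 0)."""
    name, octave = note[:-1], int(note[-1])
    return octave * 12 + NOTE_NAMES.index(name)


def number_to_note(number):
    """Convert a halfstep count back to a note name such as 'A#3'."""
    octave, index = divmod(number, 12)
    return f"{NOTE_NAMES[index]}{octave}"


def build_suffix_plan(recorded, lowest="C1", highest="B7"):
    """Map every note from `lowest` to `highest` to the nearest recorded pitch.

    When a note is exactly halfway between two recorded pitches,
    the lower pitch is chosen.
    """
    recorded_numbers = sorted(note_to_number(pitch) for pitch in recorded)
    plan = {}
    for number in range(note_to_number(lowest), note_to_number(highest) + 1):
        nearest = min(recorded_numbers, key=lambda r: abs(r - number))
        plan[number_to_note(number)] = number_to_note(nearest)
    return plan


plan = build_suffix_plan(["G3", "C#4", "G4"])
for note in ["E3", "A3", "B3", "E4", "A4"]:
    print(note, "->", plan[note])
```

    Each line of the output tells you which suffix to set for that note in the prefix map window.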

    If you are using a prefix map but want to manually edit a specific note, you can add the prefix "?" before the lyric and the proper suffix afterwards, e.g. "?かC4".

    1^Chest voice - the lowest register of one's voice.
    2^Head voice - the highest register of one's voice.
    3^Throat voice - the transitional phase between chest voice and head voice.
    4^Octave - a series of eight notes occupying the interval between (and including) two notes, one having twice or half the frequency of vibration of the other (Google).
    5^Halfstep - the smallest interval used in classical Western music, equal to a twelfth of an octave or half a tone (Google).

    I hope this was helpful! If you have any questions, don't hesitate to ask, and I will answer to the best of my ability.