UTAU Glossary

A list of words commonly mentioned when talking about the UTAU/Vocaloid software and fandom

  1. MystSaphyr

    Below is a list of words commonly mentioned when talking about the UTAU/Vocaloid software and fandom.

    To find a term, use Ctrl+F on PC or Cmd+F on Mac to search the list.


    aiff - An uncompressed sound file extension type used in Mac UTAU-Synth to build voicebanks.

    - The programmer of UTAU and the maintainer of the UTAU website/blog.

    AppLocale - An alternative way of running Japanese software (e.g., UTAU) on an English-language machine when the East Asian language packs are not installed. However, UTAU crashes if hiragana is used while it runs through AppLocale, so only romaji may be used.

    Arpasing - A style of English recording that uses the ARPAbet phonetic system. Developed by Kanru Hua, creator of Moresampler.

    Audacity - Free recording software that can be used to record wav samples for a voicebank or to mix UTAU vocals with an instrumental track. It runs on both PC and Mac computers and can be downloaded from the Audacity homepage.

    CV - CV stands for "Consonant-Vowel". It is the traditional recording system of UTAU, being designed for the Japanese language. Sounds consist of either a single V, "vowel," sound (a, i, e, o, u and n included) or "consonant-vowel" (ka, ji, no, etc).

    CVVC - CVVC, or CV VC, uses CVs (Consonant-Vowels) such as "ka" and "ke"; Vs and VVs (Vowels and Vowel-Vowels) such as "a" and "ai"; and VCs (Vowel-Consonants) such as "ak" and "ek". This is a very popular method for creating reclists in languages other than Japanese. The most popular kind of CVVC is English.
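
As a rough illustration of how CV and VC units interleave, here is a minimal Python sketch. It assumes romaji syllables and the simple "ak"-style VC naming used in the entry above; the function names `split_cv` and `to_cvvc` are hypothetical and not part of UTAU or any reclist tool.

```python
VOWELS = "aiueo"  # assumption: plain romaji vowels only


def split_cv(syllable):
    """Split a romaji syllable into (consonant, vowel). The consonant may be ''."""
    for index, char in enumerate(syllable):
        if char in VOWELS:
            return syllable[:index], syllable[index:]
    return syllable, ""  # no vowel found, e.g. a lone "n"


def to_cvvc(syllables):
    """Insert a VC unit between each pair of syllables where one can be formed."""
    syllables = list(syllables)
    units = []
    for current, following in zip(syllables, syllables[1:] + [None]):
        units.append(current)
        if following is None:
            continue
        _, vowel = split_cv(current)
        consonant, _ = split_cv(following)
        if vowel and consonant:
            units.append(vowel + consonant)  # e.g. "a" + "k" -> "ak"
    return units


print(to_cvvc(["ka", "ke"]))  # ['ka', 'ak', 'ke']
```

Real CVVC voicebanks each define their own alias spelling, so treat the output format as an assumption.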

    Defoko/Utane Uta - The default voice of UTAU, created from an AquesTalk text-to-speech voice.

    diphones/diphonics - A diphone is a unit of sound consisting of 2 phonemes (units of pronunciation). Examples would be consonant-vowel combinations such as "ka" or vowel-vowel diphthongs such as "ai". Diphonics refers to the style itself, whereas diphones refers to the specific sounds.

    diphthong - A diphthong is a 2-vowel combination that is pronounced as a single unit. For example, "ai" is not pronounced as a separate "ah-ee" but as a flowing "I".

    Fanloid/Fanmade - Any Vocaloid-like or UTAU-like fanmade character that either has a voice derived from an existing voicebank, or no actual synth voice at all.

    flags - Flags are an editing feature in UTAU that changes the tonal quality of notes. Flags include Y and H (clarity of sound), BRE (breathiness), and g (gender: + for lower, - for higher).
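
Flags are usually typed as one compact string, so a small parser can make them easier to inspect. The Python sketch below assumes each flag is a run of letters optionally followed by a signed integer (for example "g-5BRE30Y0H0"). The function name `parse_flags` is hypothetical, and the exact flags and value ranges a resampler accepts are not covered here.

```python
import re

# Assumption: a flag is one or more letters, optionally followed by a signed integer.
FLAG_PATTERN = re.compile(r"([A-Za-z]+)([+-]?\d+)?")


def parse_flags(flag_string):
    """Return a dict mapping each flag name to its integer value (None if absent)."""
    flags = {}
    for name, value in FLAG_PATTERN.findall(flag_string):
        flags[name] = int(value) if value else None
    return flags


print(parse_flags("g-5BRE30Y0H0"))
# {'g': -5, 'BRE': 30, 'Y': 0, 'H': 0}
```

A value-less flag directly followed by another flag (such as "gY0") would be read as a single name, so this pattern only suits strings where every flag carries a number.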

    genderbend/genderbent - The term for a voice that has been edited to sound like a different gender. While many people raise or lower the pitch of a song to achieve this effect, the correct way to genderbend a Vocaloid or UTAU is to change the g flags (gender settings) in the software itself.

    hiragana - One of the "kana" alphabets, or phonetic alphabets, of the Japanese language. Hiragana are the characters typically used in UTAU, especially for voicebanks originating in Japan and VCV voicebanks. Cannot be used when UTAU is run through AppLocale.

    JOKAloid - A term used for the UTAU PSS character Donka Fjord. Fjord was an April Fools' Day joke. His demo featured a very high-quality voicebank, but the released bank consists only of random noises and comes through UTAU as mechanical gurgling noises. Fjord's genderbend, Donka Dasha, also falls into this category.

    Kasane Teto - A fake Vocaloid that was the second voice usable in the UTAU software, after Defoko. Teto was an April Fools' joke created on the VIP board of the Japanese message board 2ch. She is not an official Vocaloid, although many fans believe she is because she resembles the Crypton style and has a high-quality voicebank. Her voice source is named for a parody of Doraemon's seiyuu. See also Defoko/Dehuo/Utane Uta.

    katakana - One of the "kana" alphabets or phonetic alphabets of the Japanese language. However, it is not typically used in UTAU.

    MIDI - Stands for "Musical Instrument Digital Interface". MIDI files (.mid) use software instruments readily recognized by the computer and are read as sequences of notes rather than as complex audio, which makes them smaller than mp3s and similar file types, though of more computerized quality. MIDI tracks can be imported into UTAU and appear as a series of notes sung with あ ("ah") by the UTAU voice.

    monophones/monophonics - Phonemes consisting of a single sound. These would be the most basic consonants and vowels that make up all languages.

    mora/moora - A term used to describe how many syllables each recorded string in a VCV voicebank contains (e.g., kakakikakukeka.wav is 7 moora).
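
The count in the example above can be checked with a short Python sketch. It assumes the file name is plain romaji made of simple consonant-vowel syllables and does not handle a standalone "n" or hiragana names. The function name `count_mora` is hypothetical.

```python
import re
from pathlib import Path

# Assumption: each mora is zero or more consonant letters followed by one vowel.
MORA_PATTERN = re.compile(r"[^aiueo]*[aiueo]")


def count_mora(filename):
    """Count the romaji syllables in a sample file name, ignoring the extension."""
    stem = Path(filename).stem
    return len(MORA_PATTERN.findall(stem))


print(count_mora("kakakikakukeka.wav"))  # 7
```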

    OREMO - Audio recording software developed especially for UTAU. It allows the user to record against a guide tone so that all samples are on the same note, and it displays lists of sounds, which makes sorting recordings easier. It can be downloaded from its homepage. System requirements: WinXP or later, and Japanese locale.

    phoneme - A phoneme is a base unit of pronunciation that is combined with other phonemes to make up words in language.

    Piapro - A Crypton-sponsored online Vocaloid community. Vocaloid users and artists can upload their songs and artwork for others' enjoyment. It is a hub of many prominent Vocaloid producers. However, posting UTAU works is not accepted, save for the well-known voicebanks Teto and Ritsu.

    "P-name" - An honorific, "-P" as added to the end of a name, used for mainly Vocaloid and some UTAU users that designates them as a "producer." It is a polite way of showing one's appreciation and respect for a user's music, art or MMD work. The term was derived from the Idolm@ster games.

    Reaper - Audio editing software. Reaper is professional-level and free for NON-COMMERCIAL experimentation and use after the trial period, much like WinZip is indefinitely free but prompts for purchase. Reaper offers far more editing power than Audacity. It can be downloaded from the Cockos site.

    reclist - A reclist (short for "recording list") is a list of syllables to be recorded. Generally, they contain all of the phonetic data for a language (or more than one language). Plain text reclists (with white space separating the samples) can be imported into OREMO for easy recording. In other cases, the user simply reads from the reclist as they record.
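
Because a plain text reclist is just whitespace-separated sample names, reading one takes only a few lines. The Python sketch below is a minimal illustration under that assumption. The function name `load_reclist` is hypothetical, and file encoding is left out of scope.

```python
def load_reclist(text):
    """Split reclist text on any whitespace and drop repeated entries, keeping order."""
    entries = text.split()
    return list(dict.fromkeys(entries))


sample = "ka ki ku\nke ko ka"
print(load_reclist(sample))  # ['ka', 'ki', 'ku', 'ke', 'ko']
```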

    resampler - The tool in UTAU that generates output based on the lyrics, notes and configurations in the UST file. Various resamplers include but are not limited to: resampler, fresamp, moresampler, tn_fnds, bkh01, M4, EFB-GT, w4u

    romaji - The romanized (Latin alphabet) version of the phonetic Japanese alphabet. Can be used in UTAU; when UTAU is run through AppLocale, romaji is the ONLY way of using UTAU.

    triphones/triphonics - A sound consisting of three separate, basic phonemes. Used in UTAU to describe the recording methods used by Ritsu, Teto, and several "new" UTAU; however, the more correct term for those types of banks is VCV.

    UST - Stands for "UTAU Sequence Text". It is the native filetype extension of UTAU and is the only one that the software reads without requiring import or special editing.
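
Since a UST is a text file, its contents can be read with a small script. The Python sketch below assumes the INI-like layout commonly seen in UST files: section headers such as [#SETTING] and [#0000], each followed by Key=Value lines. The field names in the sample (Tempo, Length, Lyric, NoteNum) and the helper names `parse_ust` and `note_fields` are assumptions for illustration, not a full specification.

```python
def parse_ust(text):
    """Return a list of sections, each with a name and a dict of Key=Value fields."""
    sections = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[#") and line.endswith("]"):
            current = {"section": line[2:-1], "fields": {}}
            sections.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current["fields"][key] = value
    return sections


def note_fields(sections):
    """Keep only numbered sections, which are assumed to describe notes."""
    return [s["fields"] for s in sections if s["section"].isdigit()]


SAMPLE = """[#SETTING]
Tempo=120.00
[#0000]
Length=480
Lyric=a
NoteNum=60
[#0001]
Length=480
Lyric=ka
NoteNum=62
[#TRACKEND]
"""

for note in note_fields(parse_ust(SAMPLE)):
    print(note["Lyric"], note["NoteNum"], note["Length"])
```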

    UTAU - Relates to either the UTAU software or a character avatar and/or voicebank. To be considered an UTAU, a character MUST have a usable voicebank for the UTAU software.

    UTAUloid - Fanmade term, derived from "Vocaloid", for an UTAU character avatar. To be considered an UTAU/UTAUloid, a character MUST have a usable voicebank for the UTAU software. Because UTAUloid is a nonsense word, UTAU characters and voicebanks can also simply be referred to as "UTAU".

    VCV - "Vowel-Consonant-Vowel". A phoneme technique used to record UTAU voicebanks, formerly called "triphones" or "triphonics." By recording strings of syllables and using otos to split them up, one can crossfade vowels together before consonants for sound that flows more naturally.
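
To show how a recorded string breaks down into VCV units, here is a minimal Python sketch. It assumes romaji syllables and the widely used alias style in which each unit is written as the previous vowel, a space, and the syllable, with "-" marking the start of the string. The function name `vcv_aliases` is hypothetical, and individual voicebanks may name their aliases differently.

```python
VOWELS = "aiueo"  # assumption: plain romaji vowels only


def vcv_aliases(syllables):
    """Build one alias per syllable: previous vowel (or '-') + space + syllable."""
    aliases = []
    previous = "-"  # nothing precedes the first syllable
    for syllable in syllables:
        aliases.append(f"{previous} {syllable}")
        last = syllable[-1]
        # Carry the ending vowel (or a final "n") over to the next alias.
        previous = last if last in VOWELS or last == "n" else "-"
    return aliases


print(vcv_aliases(["ka", "ka", "ki", "ka", "ku", "ke", "ka"]))
# ['- ka', 'a ka', 'a ki', 'i ka', 'a ku', 'u ke', 'e ka']
```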

    VIPPAloid - UTAU created by members of the VIP board on 2ch. Often, but not always, prank-Vocaloid-turned-UTAU, examples including Kasane Teto, Namine Ritsu, and Yokune Ruko. They are sometimes referred to as "VIPPERloids" or "VIPPERs".

    Vocaloid - Yamaha's professional vocal synth software on which UTAU is based. Popular Vocaloids include Hatsune Miku, Kagamine Rin and Len, Gumi/Megpoid, etc. Yamaha does not support or endorse the UTAU software but Crypton has shown support and enthusiasm for it.

    VSQ/VSQX - Stands for Vocaloid SeQuence file. It is the file extension type for Vocaloid2 data. Consists of notes and syllables/lyrics, and can be imported into the UTAU software with the notes and lyrics in place. VSQX is the file used by V3 and onwards.

    wav - An uncompressed sound file extension type used in UTAU to build voicebanks. Sounds must be exported from recording software in .wav format for UTAU to recognize them.

    wavtool - The tool in UTAU that generates the final wav output from the samples processed by the resampler.
