Can someone explain UTAU better, for me?

iBurrito

Momo's Minion
Okay guys, I need a very large amount of help. If I can get it, I'd really appreciate it, but if not, I guess that's okay too. I've been wanting to begin my own UTAUloid for a while now, even though I only joined the forum this month. I've been looking through all sorts of videos where users showed off their UTAUloid skills. I can't read Japanese in any form. I don't know what hiragana, katakana, romaji, or kanji are, and I can't read them either. That makes things very difficult for me when using programs like UTAU and trying to find UTAUloids to use. To add to that, I'm not sure how to use a UTAUloid. My problem isn't that I don't know how to make the UTAUloid 'sing'; it's that I can't figure out how to import UTAUloid voicebanks into UTAU. I've actually gone through a few tutorials to get some help with importing voicebanks, but there's always a problem. Every voicebank I've downloaded ended up with file extensions that I didn't understand. Also, although I don't need to read the Japanese symbols to figure out the sounds in the banks, I can't figure out how to fix a VSQ (not sure if it's called a UST or a VSQ). What I mean is, say I downloaded a VSQ. Each note has romaji written on it. The voicebank is in Japanese symbols that I do not understand. The UTAUloid will not sing it.

To put it short (because I know this is boring and I'm confusing), can someone please explain/re-explain the topics here:
http://utaforum.com/topic/8181272/1/

At first I began to understand what this user meant, and what she was talking about, but then when she continued on, she spoke as though I knew what 'strings' were.

You take a string of four to eight (in the case of Uzne Hito, who was released last month with an 8-mora, or 8-syllable, reclist)

I don't get the concept, as she's speaking as though I already have previous knowledge of it, which doesn't help me figure out what to do. And to tell you the truth, I still don't understand what a reclist is. So can someone please re-explain this for me? Or at least make it a little simpler to understand? I'm not sure how I'm supposed to write this out without causing tons of confusion.
 

Raiyux

Ruko's Ruffians
Defender of Defoko
I just saw this, but... are you still confused? Because I think I could help you some, if you still need it.
 

iBurrito

Momo's Minion
Thread starter
Raiyux link said:
I just saw this, but... are you still confused? Because I think I could help you some, if you still need it.

Ah, even though it could easily be found in the Wikia, or just Googled, what's the difference between a VCV and a CV voicebank? Orz Also, the difference between UST's, VSQ's, and MIDI's? I never understood that. ;_;
 

Raiyux

Ruko's Ruffians
Defender of Defoko
iBurrito link said:
Ah, even though it could easily be found in the Wikia, or just Googled, what's the difference between a VCV and a CV voicebank? Orz Also, the difference between UST's, VSQ's, and MIDI's? I never understood that. ;_;

Okay... CV stands for 'Consonant-Vowel' (like 'ka', 'ba', 'sa'). It's just one sample at a time. Basically, when you record CV, every sample is by itself in its own file. This is why CV sounds choppier than VCV.

VCV stands for 'Vowel-Consonant-Vowel'. Unlike CV, you record your samples in strings (5-mora/7-mora refers to how many samples are together in the string). For a VCV recording list, you'll see 'a_ka_sa_ta_na'. This means that you'll record all of this together with no stops or pauses in between. It comes out smooth because you're actually recording the vowels in between. When you do the oto.ini, you'll break 'a_ka_sa_ta_na' up into 'a_ka', 'a_sa', 'a_ta', 'a_na'. That way you get a little bit of the preceding vowel in the recording.
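
If it helps to see the splitting spelled out, here is a small Python sketch of the idea. It's only an illustration, not anything UTAU itself runs: the function name `split_vcv_string` is made up, it assumes every sample in the string ends in one of the five vowels, and it keeps the underscore from the example above as the separator.

```python
VOWELS = "aiueo"


def split_vcv_string(recorded: str, sep: str = "_") -> list[str]:
    """Break one recorded string into 'preceding vowel + sample' pieces."""
    morae = recorded.split(sep)
    pieces = []
    # Pair every sample with the one recorded right before it.
    for previous, current in zip(morae, morae[1:]):
        preceding_vowel = previous[-1]  # last letter of the earlier sample
        if preceding_vowel not in VOWELS:
            raise ValueError(f"{previous!r} does not end in a vowel")
        pieces.append(f"{preceding_vowel}{sep}{current}")
    return pieces


print(split_vcv_string("a_ka_sa_ta_na"))
# ['a_ka', 'a_sa', 'a_ta', 'a_na']
```

Every piece starts with 'a' here only because every sample in this particular string ends in 'a'. A string like 'i_ki_shi' would give 'i_ki' and 'i_shi' instead.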

I recorded this to hopefully clear up my explanation:
[soundcloud]http://soundcloud.com/raiyux/cv-vcv-example[/soundcloud]

As for UST's, VSQ's, and MIDI's... A UST is a file used only by UTAU. In short, it is the program's way of keeping data. The way a Word document saves all the text you type in, UTAU saves the lyrics you put in. It also saves data on which UTAU you were using when you made the UST. Conveniently enough, I have a UST open right now, which I'll print-screen and put as an attachment.

VSQ's do the same thing that UST's do, but are for Vocaloid. If you try to import them into UTAU, you'll get nothing. I believe there is a way to convert them to UST's and vice-versa, but unfortunately I'm not knowledgeable on that subject. Sorry. :sad:

MIDI's are basic music files that pretty much any kind of audio program can open, but they're not very specific (strictly speaking, they store note data rather than recorded sound). What I mean by that is... if you exported your UTAU data as a MIDI, then played it back on the computer, you'd get a piano track. It would save the pitch of everything, just not the voice. MIDI's are also good tools for making UST's, though! If you open up a MIDI in UTAU (like I said), all you get is pitch and note-length. So UTAU automatically just puts 'a' for every note. All you have to do then, is change the lyrics.
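
Here's a rough Python sketch of that idea, in case a picture in code helps. It doesn't read real MIDI or UST files; the `Note` class and both function names are invented for the example. It just shows that a MIDI import only gives you pitch and length, so every note starts out as 'a' until you type the real lyrics.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int        # MIDI note number (60 is middle C)
    length: int       # how long the note lasts, in ticks
    lyric: str = "a"  # placeholder lyric every imported note starts with


def import_midi_notes(pitch_and_length):
    """Build notes from (pitch, length) pairs; no lyric data is available."""
    return [Note(pitch, length) for pitch, length in pitch_and_length]


def set_lyrics(notes, lyrics):
    """Return new notes with the placeholder lyrics replaced, one per note."""
    if len(notes) != len(lyrics):
        raise ValueError("need exactly one lyric per note")
    return [Note(n.pitch, n.length, lyric) for n, lyric in zip(notes, lyrics)]


melody = import_midi_notes([(60, 480), (62, 480), (64, 960)])
print([n.lyric for n in melody])           # ['a', 'a', 'a']

sung = set_lyrics(melody, ["ka", "sa", "ta"])
print([(n.pitch, n.lyric) for n in sung])  # [(60, 'ka'), (62, 'sa'), (64, 'ta')]
```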

This is a recording of a MIDI I made from the UST I have open:
[soundcloud]http://soundcloud.com/raiyux/midi-example-worlds-end-dance[/soundcloud]
And this is what the actual UST sounds like when sung by my UTAU:
[soundcloud]http://soundcloud.com/raiyux/ust-example[/soundcloud]

I hope this helps you! Trust me, I was just as overwhelmed as you were when I started using UTAU. It gets easier after you start playing around with it. ^^
 

Attachments

  • UST.jpg
    38.5 KB · Views: 9

theLooneyLibrarian

Teto's Territory
...what Raiyux said.
Except that you actually can import VSQs into UTAU (there's an import function that lets you import MIDIs and VSQs).
The only problem with that is that while USTs have all the nice pitch bends, vibrato, flags (combinations of letters and numbers you can input into the project properties that control breathiness, gender factor and whatnot) and so forth, imported VSQs or MIDIs don't have that, so the result will sound...well, boring, I guess.
(I have no idea how many flags there are and what they all do, so don't bother with them right now.)
Question: Have you gotten the latest English patch for UTAU yet? If not, you can get it here: http://www.voiceblog.jp/mianaito/1062049.html
Trust me, it makes everything so much easier. Now.
About the Japanese symbol thing. Once you've got the patch, all your menus in UTAU will be in English. The USTs and (to a certain extent) the voicebanks will still be in Japanese, typically in hiragana (that's the kind of Japanese characters most voicebanks are coded in). So, if you seriously want to use UTAU, you should learn hiragana. BUT there are voicebanks (and USTs) in romaji (normal roman letters, which you can read), plus Japanese voicebanks can be aliased to support both romaji and hiragana (and vice versa).
...That's still not helping, is it? Well, I'm going to make a tutorial during the next few days called "Let's make Teto sing". I'll tell you when I'm done, maybe that'll help. ^^:
 

iBurrito

Momo's Minion
Thread starter
I was actually wondering why Hachi Makune's voicebank had so many sounds together. I couldn't figure out how to use it and I got a little confused.

One last question (lol, I ask too many questions), from your screen shot, I can tell that your voicebank is in Japanese (I suppose) and after playing around with my keyboard - after converting my locale to Japanese - I would like to know how you knew you had the right.. uh.. 'symbols' (I'm not sure what they're called xD) for each sound?

Orz Like, so that your voicebank would work with pre-made UST's. D: GAH IDK. I'M NEW TO THIS .
 

Raiyux

Ruko's Ruffians
Defender of Defoko
iBurrito link said:
One last question (lol, I ask too many questions), from your screen shot, I can tell that your voicebank is in Japanese (I suppose) and after playing around with my keyboard - after converting my locale to Japanese - I would like to know how you knew you had the right.. uh.. 'symbols' (I'm not sure what they're called xD) for each sound?

Orz Like, so that your voicebank would work with pre-made UST's. D: GAH IDK. I'M NEW TO THIS .

I only have a vague idea of what you're asking, but I'll try to answer it the best I can.

Those 'symbols' are called hiragana. While it helps to know hiragana, you can very easily record a voicebank without needing to know them all. I didn't know hiragana for my first UTAU. :wink:

Just to remind you, romaji is what we call romanized Japanese. It's just Japanese spelled with plain old letters. If you find a recording list in romaji, you can use those to record your UTAU. While there are UST's in romaji, most of them will be in hiragana. BUT...You can make it so that UTAU automatically associates your romaji-named files with the correct hiragana 'symbol'.

First, you just select the UTAU you want to use in UTAU. Then, you go to 'Tools' at the top of the window. From Tools, you go to 'Voicebank Settings'. When you click this, it'll open up a list of your recordings with their oto.ini configurations. Now...on the right side of this new window, there's a bunch of boxes where stuff can be typed in. In the top box, you'll see the file name. Underneath is a box that says (if UTAU is in English) 'Alias'. All you have to do is switch your keyboard to Japanese, choose to type in hiragana, and then input the hiragana for each sample into the 'Alias' box. Once you do this, your romaji UTAU will be able to read hiragana. You can also do the reverse: record in hiragana and alias with romaji. I'm leaving you another screenshot.
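
For the curious, this is roughly what the aliasing amounts to, sketched in Python. Treat it as an assumption-heavy illustration: as far as I understand it, each oto.ini line looks like `file.wav=alias,offset,consonant,cutoff,preutterance,overlap`, but the zeros below are placeholders rather than real timing values, the tiny romaji table only covers a handful of sounds, and the function names are made up. Real oto.ini files are also usually saved in a Japanese encoding, which this sketch ignores.

```python
from pathlib import Path

# Tiny sample table; a real voicebank needs the full set of sounds.
ROMAJI_TO_HIRAGANA = {"a": "あ", "ka": "か", "sa": "さ", "ta": "た", "na": "な"}


def oto_line(wav_name, alias, timings=(0, 0, 0, 0, 0)):
    """Format one line in the assumed oto.ini layout (placeholder timings)."""
    numbers = ",".join(str(value) for value in timings)
    return f"{wav_name}={alias},{numbers}"


def alias_romaji_bank(wav_names):
    """Give each romaji-named recording its hiragana alias, if we know it."""
    lines = []
    for wav_name in wav_names:
        romaji = Path(wav_name).stem                # 'ka.wav' -> 'ka'
        alias = ROMAJI_TO_HIRAGANA.get(romaji, "")  # unknown sounds stay blank
        lines.append(oto_line(wav_name, alias))
    return lines


for line in alias_romaji_bank(["a.wav", "ka.wav", "sa.wav"]):
    print(line)
# a.wav=あ,0,0,0,0,0
# ka.wav=か,0,0,0,0,0
# sa.wav=さ,0,0,0,0,0
```

Typing the hiragana into the 'Alias' box by hand does the same job; a script like this would only save time on a big bank.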

You know how to use your keyboard in Japanese, right? In case you don't... (and Idk what operating system you have so it might not be the same for you, but I use Vista) You just click the 'EN' down on the taskbar and then click Japanese (You might be able to just alt+shift to change the languages, depending on how you have it set). Then after that, you click the weird looking 'A' next to what now says 'JP', which opens up a drop-down box. From the list, you pick 'ひらがな (H)', for hiragana. Once you've done this, anything you type will turn to hiragana. Pressing 'Enter' finalizes what you typed.

Fun fact: I actually learned to read hiragana from using UTAU! Surprisingly, I think it's actually a good learning tool for that.

Anyway...I hope I answered your question. If you have any more questions, you can ask me anytime. :3
 

Attachments

  • alias.jpg
    38.5 KB · Views: 7

iBurrito

Momo's Minion
Thread starter
Thank you for your help! ;D
Orz all I need to do now is create the voicebank. ^^
 
