English UTAU Voice Bank Style Using X-SAMPA?

Discussion in 'UTAU Discussion' started by WilyemAndK, Aug 8, 2017.

  1. WilyemAndK

    Hi there!

    Some of you might remember me for posting the new rec-list type thread. I basically gave up on that project because the recordings were getting tedious. I also NOW KNOW FLUENT ENGLISH!!! Yay! That'll make things a lot easier for me. Anyway, I've recently been thinking a lot about how VOCALOID uses English syllables in its recording process. I even went over how VOCALOID uses X-SAMPA. It's both simple and advanced. Basically, I was wondering if there was a way for people to use a VOCALOID rec-list, but in UTAU!
    I know many people have already thought about this, but I think we should explore it more. I think the way it should work is that you record the syllable - or a string of syllables, perhaps (?) - and then alias it under the pronunciation.
    For example: you record something encoded in X-SAMPA (for this example it'll sound like "cheap"), and then you alias it under the word "cheap", or "Cheap". Then you could make the UTAU sing "I love Cheap Thrills" simply by typing the words into the notes.
    The goal of this project is to make using UTAU easier for people. Hopefully, we'll even find ways to make the recording simpler while still maintaining the VOCALOID type of feel to it.

    ~~~


    What I'll need:

    • A volunteer who's willing to give some of their time to record the syllables and all that (must have Discord or Skype at least. Also, you don't have to be active all of the time. Private message me if interested)
    • Other people who can help with finding VOCALOID recording lists or other recording techniques
    • Others who think they can help with the project in general, even the tiniest amount. Like helping with X-SAMPA encoding and such.

    ~~~

    I found a thread article that doesn't seem very active, but... I'm just gonna revive it lol
    I also found an English VOCALOID rec-list web page that could be of some use. It's for VOCALOID1 but it's a start, isn't it?

    ~~~

    So what do you guys think? You can ask questions if you want to. I want to hear what you have to say about this.
    Also, once again, you'll have to keep in mind that we might close this project like we did with the last one.
    Thanks for letting us take some of your time!
     
  3. MANIAGIRLKITTY

    I'm working on a project like this right now, and I understand it pretty well. I have 3 pages of sounds in X-SAMPA, and I can record them for you. But trying to find a reclist from Vocaloid is just impossible. I tried this too, and you won't find anything except information about the reclist (this is what I found after 3 months, d'oh). You will have to make one from scratch, but it's quite easy once you get into it. But do know that the list of sounds will be more than UTAU allows into the program.
     
  4. VocAddict

    Not everything here may be exactly accurate, so forgive me if anything is wrong.

    Vocaloid uses a word-based reclist for its banks, but Vocaloid doesn't output exactly what was recorded. The recorded strings are broken up into their phonetic units and then reconstructed for whatever lyrics are inputted. Vocaloid just shows us the phoneme selection for the word that was inputted.

    So, let's say "cheap" is inputted as a lyric. Vocaloid would show [tS i: p], but Vocaloid doesn't just have an audio file labelled "cheap" waiting there. It constructs the audio from numerous phonetic units that were broken up from the recording script, and then strings them together to form the word.
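
    To make that concrete, here is a rough sketch of the idea, not Vocaloid's actual internals. The dictionary, the "Sil" silence marker, and the two-phoneme units are all assumptions made up for illustration; only the [tS i: p] breakdown of "cheap" comes from what Vocaloid displays.

```python
# Hypothetical pronunciation dictionary: lyric -> X-SAMPA phonemes.
# Only "cheap" comes from the post; "cheat" is added for illustration.
LEXICON = {
    "cheap": ["tS", "i:", "p"],
    "cheat": ["tS", "i:", "t"],
}


def to_phonemes(word):
    """Look up a lyric (case-insensitive) and return its phoneme list."""
    return LEXICON[word.lower()]


def to_units(phonemes):
    """Break a phoneme list into overlapping two-phoneme units.

    "Sil" (silence) pads both ends. This is an assumed stand-in for
    whatever units a synthesizer actually stitches together.
    """
    padded = ["Sil"] + phonemes + ["Sil"]
    return [f"{a} {b}" for a, b in zip(padded, padded[1:])]


print(to_units(to_phonemes("Cheap")))
# ['Sil tS', 'tS i:', 'i: p', 'p Sil']
```

    The point is that "cheap" and "cheat" would share most of their units, so no audio file per word is needed.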

    Also, a method like this won't really work because of the way UTAU was built. Phonemes are entered separately because UTAU doesn't have an automatic conversion feature like Vocaloid does. Yes, it is possible to just record the word "cheap" and play it, but it won't really sound good. What if you want to change pitch halfway through the word? How would you control the utterance of the final consonant?

    The only way I've seen the "Vocaloid" style of inputting lyrics into UTAU is through the use of presamp and the Arpasing Assistant. It converts the word into its phonemes based on a dictionary, so there's still a bit of adjusting on the user's side, but it makes things a lot easier.

    The Arpasing method by Kanru Hua features a word-based reclist to make recording simpler and more natural compared to the "gibberish" way reclists have been recorded in the past. There have also been X-SAMPA reclists for English in the past, though I can't remember any at the moment.

    I hope that was understandable and if there's anything here that I explained wrong or could have been done better, feel free to point it out.
     
    Last edited: Aug 8, 2017
  5. MANIAGIRLKITTY

    Yes, you're correct: when recording full words, they will not work with short notes. I had to record the same word 3 times: short, normal, and long. It takes up a lot of space, but it does sound decent. And pitch goes fine with it.
     
  6. heta-tan

    I have two English lists in X-SAMPA; Delta's lists also use X-SAMPA, along with LEXYS.

    The problem with having full words as your notes is that you have no control over a lot of factors and it would take up too much space. Vocaloid only shows the full word for user ease. Behind the scenes, Vocaloid is using several chunks like UTAU does.
    You could make a reclist similar to Arpasing or just convert the phonemes to X-SAMPA, but it's impossible to get a result like Vocaloid in UTAU when they already do the same thing...
     
  7. 幸兔雪 (Yukito Yuki)

    Delta CVVC English is encoded in X-SAMPA (for example, Teto English is this VB type). Also, I have seen that the AutoCVVC plugin can convert English words to Delta CVVC phonetics, though I don't know exactly how that plugin works; I have only seen Japanese users showing it on Twitter. Also, if I remember right, there was a lyric plugin designed especially for the Teto English VB.

    However, Delta English style VBs and their reclist aren't popular in the Western UTAU community. They're more popular in the Japanese UTAU community, where Delta's English is advertised as an "easy and lighter way to record English". I must say that it's easier and shorter than VCCV, but the CC section of Delta's reclist needs a little adjustment/editing. Otherwise it's great.

    And then there's a new method for English named Arpasing. Unlike previous English recording lists, its recordings are based on real words. And because of the way it's configured, very little recording is required for a complete VB, unlike CVVC Delta or VCCV. But the downside of Arpasing is that the default list (0.1.0 or 0.2.0) doesn't cover all CV/VC/VV phonetics and lacks the schwa sound, and an Arpasing VB uses more very tiny notes overall compared to other English methods.
     
  8. WilyemAndK

    Hmm... well would we still be able to record single English syllables and then just alias them like CVVC or VCCV? I mean, it would be similar to those two, obviously, but would you be able to just record most of the single syllable combinations in English and such? I mean, obviously, it wouldn't be the same as VOCALOID.
     
  9. Kiyoteru

    There are too many of those for this idea to be practical. There are so many possible starting consonants, vowels, and ending consonants that you cannot approach an English reclist the same way you'd approach a Chinese reclist.

    For example,
    Code:
    skr{p skr{b skr{t skr{d skr{k skr{g
    skr{tS skr{dZ
    skr{f skr{v skr{T skr{D skr{s skr{z skr{S skr{Z
    skr{m skr{n skr{N skr{r\ skr{l
    skr{sp skr{st skr{sk
    skr{pT skr{pTs skr{pt skr{pts skr{ps skr{pst
    etc...
    
    And these are just some of the possible syllables that start with "skr{"! You're multiplying the length of a normal English CVVC reclist many times over, to a length that might not even fit into the OTO limit of PC UTAU. On top of that, UTAU and its resamplers are not built to handle CVC-type syllables, so for practicality reclisters typically split English syllables into the beginning CV part and the ending VC part. That way, you only record one "skr{" sample, and you will be able to make all the words you need that start with it. Even when people do crazy things like English VCV, the bulk of the reclist is just CV style with VC samples recorded separately.
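
    A quick back-of-the-envelope sketch of why the split helps. The inventories below are tiny toy lists made up for illustration (real English has far more onsets, vowels, and codas), and the helper names are hypothetical, not part of any UTAU tool.

```python
def count_samples(onsets, vowels, codas):
    """Return (whole-syllable count, split CV + VC count)."""
    whole = len(onsets) * len(vowels) * len(codas)
    split = len(onsets) * len(vowels) + len(vowels) * len(codas)
    return whole, split


def split_syllable(syllable, vowels):
    """Split an X-SAMPA syllable into its CV and VC parts at the first vowel."""
    # Try longer vowel symbols first so a symbol such as "i:" wins over
    # any shorter symbol it starts with.
    by_length = sorted(vowels, key=len, reverse=True)
    for start in range(len(syllable)):
        for vowel in by_length:
            if syllable.startswith(vowel, start):
                end = start + len(vowel)
                return syllable[:end], syllable[start:]
    raise ValueError(f"no known vowel in {syllable!r}")


# Toy inventories, far smaller than real English.
ONSETS = ["skr", "str", "k"]
VOWELS = ["{", "i:"]
CODAS = ["p", "b", "t", "d"]

print(count_samples(ONSETS, VOWELS, CODAS))  # (24, 14)
print(split_syllable("skr{p", VOWELS))       # ('skr{', '{p')
```

    Whole syllables grow as a product of the three inventories, while the CV + VC split grows as a sum of two smaller products, so the gap gets much wider with realistic inventory sizes.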
     
  10. WilyemAndK

    Hmm... Okay then. That crosses EVERYTHING off of the list, then! Well, I've already found somebody willing to voice the UTAU, so I guess I'll just create a normal UTAU. Like a VCCV or something idrk.
    Thanks for the feedback~~
    Wilyem
     
