• If you do not receive your confirmation email within a few hours, please email haloutau@gmail.com with your username for manual validation. Your account should be activated within 24 hours.
    You may also reach out via any other listed contact on Admin Halo's about page: https://utaforum.net/members/halo.194/#about
Kierteiyu

  • I'm about to upload Suzuki Hoshi V5's demo (ERROR, which will be used for the append demos too)
    it's not mixed (no inst), and I tuned only part of it, but hopefully it sounds ok
    MWAH! お願い君がほしの (roughly: "please, I want you")
    Kierteiyu
    I don't speak ChatGPT and I don't think Google's translation was correct
    Kierteiyu
    Why did I say ChatGPT instead of Japanese
    Natsucchii
    This is "Monitoring" by DECO*27 btw
    Should've written /lyr ig
    I beat utaforum
    [screenshot attachment]
    SaKe
    Well, we’ve had a good run. You’ve finally beaten Utaforum. Mods, shut the forum down.
    Kierteiyu
    We don't need to shut the forum down. There's enough replayability.
    SaKe
    Don't forget the endgame content.
    I thought of a somewhat reasonable idea for how CVCV could be used.
    Say someone wanted to port their AI voicebank into UTAU; CVCV could be used to keep it as realistic as possible.
    Kierteiyu
    I mean, if they had an AI voicebank, they wouldn't care about realism when porting to UTAU. Also the resulting bank would be way too big (many gigabytes).
    I'm making an RVC-able CVCV Lite voicebank using Google's TTS service and a batch renderer I made. Is there a base oto? I don't want to type in all those aliases...
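    For reference, a minimal sketch of what a batch render like that might look like, assuming the gTTS package as a stand-in for whichever Google TTS endpoint was actually used (the file names are placeholders, not the actual script):

```python
# Hypothetical batch renderer: synthesize each reclist alias with Google TTS.
# Assumes the gTTS package (pip install gTTS); "reclist.txt" is a placeholder
# plain-text file with one alias per line.
from pathlib import Path
from gtts import gTTS

out_dir = Path("rendered")
out_dir.mkdir(exist_ok=True)

for alias in Path("reclist.txt").read_text(encoding="utf-8").split():
    gTTS(alias, lang="ja").save(str(out_dir / f"{alias}.mp3"))
    # gTTS writes mp3; UTAU expects wav, so a conversion step (e.g. ffmpeg) would follow.
```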
    GreenPear03
    You shouldn't have to type the aliases in by hand; you should be fine having UTAU generate a blank oto file and then otoing from there. I also think it would be nice to have VC sounds for the 'n', such as 'a n' and 'e n'.
    Kierteiyu
    There's no base oto sadly. You just have to suffer.

    As for the a_n and e_n thing, I can modify my Python script and update that quickly.
    Kierteiyu
    Posted the update to the reclist. It renames samples like an and en to a_n and e_n. Also adds them to the lite reclist.
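    A rough sketch of what that rename pass might look like, assuming the reclist is a plain text file with one alias per line (file names are hypothetical, not the actual script):

```python
# Hypothetical rename pass: turn trailing-nasal aliases like "an"/"en" into "a_n"/"e_n".
import re
from pathlib import Path

def rename_alias(alias: str) -> str:
    # Insert an underscore between a vowel and a final "n", e.g. "an" -> "a_n", "en" -> "e_n".
    return re.sub(r"([aiueo])n$", r"\1_n", alias)

lines = Path("reclist.txt").read_text(encoding="utf-8").splitlines()
Path("reclist_renamed.txt").write_text(
    "\n".join(rename_alias(line) for line in lines), encoding="utf-8"
)
```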
    I made a Japanese CVCV reclist: https://mega.nz/folder/KVNjxIZD#yKB4wCiVXtN9QtYrH-dTsQ

    Lite has CV and CVCV samples, plus a single N sample
    Standard has CV, CVCV, CVNCV, CVCVN, and CVNCVN samples.

    Lite is 8196 samples long
    Standard is 27095 samples long

    Ignore the borked versions. They're borked. Probably also ignore the not borked versions. They're still cursed.

    A CVCVCV update is coming
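    For anyone wondering where counts like that come from, a rough sketch of how CV and CVCV aliases can be enumerated from a plain CV list (the CV list below is a tiny placeholder; a real Japanese list has on the order of 90 entries, which is roughly why CV plus CV×CV lands in the eight-thousand range):

```python
# Rough sketch: enumerate CV and CVCV aliases from a CV list and count them.
from itertools import product

cv_list = ["ka", "ki", "ku", "ke", "ko", "sa", "shi", "su"]  # placeholder, not the real list

cv_samples = list(cv_list)                                          # plain CV
cvcv_samples = [f"{a}_{b}" for a, b in product(cv_list, repeat=2)]  # every CV-CV pair
lite = cv_samples + cvcv_samples + ["n"]                            # plus the single N sample

print(len(lite))  # len(cv_list) + len(cv_list)**2 + 1
```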
    Kierteiyu
    Lite would sound horrible though. To synthesize kantan it'd be ka + n + ta + n, meaning nasal sounds force you to break the samples apart.

    Standard does it with one sample, but Standard is insane. Also I think the non-borked version is still borked, because it's kata instead of ka_ta. I'm rewriting my generation script for CVCVCV, so I'll quickly remake CVCV.
    GreenPear03
    Okay! You can also use VC sounds for the N sound, though that defeats the purpose of using CVCV as opposed to CVVC or VCV.
    Kierteiyu
    I updated it with the script, a readme, Kieths_JP_CVCV_lite, and Kieths_JP_CVCV_standard.

    That's the final reclist and you shouldn't use it.
    I'm making an English reclist called "GenAm sampaList"

    It uses transitions between every phoneme. For the word "five" the samples are [f_a a_I I_v]

    It has like 500 samples (two phonemes per sample) in the uncut version, so I need to fix that.
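    A small sketch of the transition idea, assuming a word is already transcribed as a SAMPA phoneme sequence (matching the "five" example above):

```python
# Sketch: pairwise transition samples from a phoneme sequence,
# e.g. "five" as [f, a, I, v] -> [f_a, a_I, I_v].
def transitions(phonemes: list[str]) -> list[str]:
    return [f"{a}_{b}" for a, b in zip(phonemes, phonemes[1:])]

print(transitions(["f", "a", "I", "v"]))  # ['f_a', 'a_I', 'I_v']
```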
    Kierteiyu
    I quickly cut the list down to 563. I will cut it down more tomorrow.

    Also, the official Arpasing list is 220 lines long and each line is 5-10 phonemes long. I have 563 lines but with 2 (pronounced with 2-3) phonemes each, so roughly 1,100-2,200 phonemes for Arpasing versus about 1,100-1,700 for mine.
    Kierteiyu
    So actually a pretty similar length
    idc what anyone says, I'm using 3 pitches from my CVVC as data for my DiffSinger. If it turns out shit, so be it.
    Kierteiyu
    Also, you might get overfitting (where the model memorizes the data instead of generalizing). This shows up as validation loss being much higher than training loss.
    Kierteiyu
    If that happens, scale down the model so it doesn't have enough parameters to memorize the data verbatim.
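    A minimal sketch of the kind of check being described, using logged per-epoch losses (the numbers and threshold below are made up for illustration, not from any real training run):

```python
# Sketch: flag likely overfitting when validation loss climbs while training loss keeps falling.
train_loss = [0.90, 0.55, 0.30, 0.15, 0.08]  # made-up placeholder values
val_loss = [0.95, 0.70, 0.65, 0.80, 1.10]    # made-up placeholder values

for epoch, (tr, va) in enumerate(zip(train_loss, val_loss), start=1):
    print(f"epoch {epoch}: train={tr:.2f}  val={va:.2f}  gap={va - tr:.2f}")

# A gap that keeps growing means the model is memorizing the training data;
# shrinking the model or adding more data are the usual fixes.
if val_loss[-1] - train_loss[-1] > 0.5:  # arbitrary illustrative threshold
    print("validation loss is far above training loss -> likely overfitting")
```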
    Natsucchii
    I think it'll be fine, but the Colab stuff is gonna have to wait for a bit
    Screenshot because of character limit

    GreenPear03
    This was me for a while when I was creating a vocal synthesizer in JavaScript: I kept switching between Web Audio API forks before realizing I'd be better off just using the Web Audio API itself, which is what I had thought of using in the first place. Maybe you'll go back to one of these libraries.
    I was thinking about making an English cover for PinocchioP's latest song, but I don't know how to do that.
    Eleanor Forte is the best voicebank ever. She's not realistic but she sounds amazing
    Mr. Cloud
    H u h

    I never said that- ;-;
    SunnyWolves
    I know, just exaggerating for a joke. Don't mind me.
    Kierteiyu
    I love her deeper tone, especially with the AI bank. Her voice is also just a little bit mechanical, which I like compared to ultra-realistic voices.