Thoughts on Ameya's possible UTAU update tweets?

Discussion in 'UTAU Discussion' started by LilyoftheValley, Dec 23, 2018.

  1. LilyoftheValley

    LilyoftheValley Your local flower girl Defender of Defoko

    Messages
    154
    Likes Received
    320
    Trophy Points
    77
    I haven't seen anyone make a thread on this yet; if there is one, throw it down below.

    Our lord and savior has returned with tweets showing an update/translation to UTAU. It seems to be just a German translation, but it has new features? Anyone understand these tweets better? What is actually being done? Is it an actual update? Thoughts? Speculations?
    Discuss and stuff

    https://twitter.com/ameyaP_

    [​IMG] [​IMG] [​IMG] [​IMG]
     
    sangv, Pixys, flutteraliviaNL and 4 others like this.
  3. Avalia-Kasa

    Avalia-Kasa probably a potato tbh Supporter Defender of Defoko

    Messages
    334
    Likes Received
    246
    Trophy Points
    61
    i'm so excited for this update you have no idea!!! im VERY sure its an update coming soon

    things i think will be features:
    • shareware UTAU (probably not freeware but i might be wrong) will have the folder alias - from what i can tell, putting a folder alias marker in a note (i'm guessing # is the trigger) will change the subfolder the voicebank is using, for easier multiexpression
    • POSSIBILITY of end consonant oto configuration being added which is!! SO AMAZING!! FOR NON-CV LANGUAGES!!!! AAA?? (though this isn't explicit and we can only speculate based off the cropping of the pictures)
    • you can have subfolders inside of subfolders now so instead of having banks with (Soft-A3), (A3), and (Power-A3) you can have (Default) (Soft) and (Power) folders with the pitches inside those!!
    • judging by the sheer volume of features from this update i'm willing to bet there will be either a change to the oto limit or subfolders will be a way of getting around it... but dont quote me on this one
    im so hyped!! you have no idea how excited i am aaaaaAAAAA??
     
    Mitt64, sangv, UnclePeter and 6 others like this.
  4. 幸兔雪 (Yukito Yuki)

    幸兔雪 (Yukito Yuki) Pronouns: They/Them Supporter Defender of Defoko

    Messages
    771
    Likes Received
    2,051
    Trophy Points
    123
    I’m excited to see a possible update too! It has literally been 5 years since the last update.

    (I hope that UTAU-Synth will eventually get an update too)
     
  5. Raindropx

    Raindropx Teto's Territory Defender of Defoko

    Messages
    44
    Likes Received
    88
    Trophy Points
    28
    WOW! I'm glad to know he's still alive.


    actually
    I want ogg/aac support more than mp3 voicebank support, they are better lossy compressed audio formats than mp3
    Although it looks like this is still vb6
     
  6. Ivy!

    Ivy! Teto's Territory

    Messages
    35
    Likes Received
    28
    Trophy Points
    28
    I'm just happy we're getting content
     
  7. Kiyoteru

    Kiyoteru Local Sensei Supporter Defender of Defoko

    Messages
    2,767
    Likes Received
    3,659
    Trophy Points
    157
    Since all of the features in freeware still exist in shareware, we don't actually know for sure which version the update will be for. The screenshots feature the shareware version, yes, but it would probably be concerning if a software developer didn't have full access to every feature they were still working on.
     
  8. Avalia-Kasa

    Avalia-Kasa probably a potato tbh Supporter Defender of Defoko

    Messages
    334
    Likes Received
    246
    Trophy Points
    61
    i'm basing my assumption on this screenshot
    [​IMG]
    while it may still be freeware that will be able to use markers for changing the expression, i think the use of auto vcv is an interesting choice of screenshot and... tbh? there's not much reason to get shareware utau since vcv plugins exist already, so it might be a good incentive to get it if ameya does indeed decide to put it on shareware only o:
    after all, freeware still has the suffix broker
     
  9. Avalia-Kasa

    Avalia-Kasa probably a potato tbh Supporter Defender of Defoko

    Messages
    334
    Likes Received
    246
    Trophy Points
    61
    ok rollback? i guess?

    thing i posted last time: strength feature for how much the dynamics change with vibrato (which i'm hoping will also mean more dynamic support than just envelopes)
    [​IMG]

    new thing?? oooo crossfade optimization??? tell me more ameya >.>
    [​IMG]
     
    LilyoftheValley and Kiyoteru like this.
  11. Kiyoteru

    Kiyoteru Local Sensei Supporter Defender of Defoko

    Messages
    2,767
    Likes Received
    3,659
    Trophy Points
    157
    I'm very intrigued by this feature and I'm hoping that there are other people I can discuss it in depth with. If you don't mind, I'll be copying some of the thoughts I've already shared elsewhere, with some edits to clarify them and make them more legible.

    ---

    Why not yellow though?
    (friend) I love that mint green for second consonant.
    I disagree, it should be yellow. Every parameter should have its own color. Overlap and endcons could get confused easily.

    (friend) maybe it's good for Chinese?
    Complete chinese syllables in vocalsynth (cvvchinese, vocaloid phonemes, etc) end up with really cyva-ish pronunciation (ie. overpronounced)

    Honestly though, I'm not sure how this fits into our current knowledge of OTO and reclist theory.
    That just takes away the biggest advantage of CVVC, which is that you're able to mix and match CVs and VCs.
    A voicebank with:
    ka ak ta at pa ap (CV and VC samples)
    VS:
    kak kat kap tak tat tap pak pat pap (complete CVC samples)
    There's a big difference in the efficiency of the approach
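
    To put rough numbers on that comparison, here is a minimal Python sketch that enumerates both kinds of sample list for the same phonemes. The function names are made up for illustration and aren't part of UTAU or any existing tool; it also assumes every consonant and vowel is written as a single letter.

```python
from itertools import product


def cv_vc_samples(consonants, vowels):
    """Every CV sample, each followed by its mirrored VC sample."""
    return [s for c in consonants for v in vowels for s in (c + v, v + c)]


def cvc_samples(consonants, vowels):
    """Every complete consonant-vowel-consonant sample."""
    return [c1 + v + c2 for c1, v, c2 in product(consonants, vowels, consonants)]


# Same phonemes as the example above: k t p with the vowel a.
consonants, vowels = "ktp", "a"
print(cv_vc_samples(consonants, vowels))
print(cvc_samples(consonants, vowels))
```

    With n consonants and m vowels that comes to 2*n*m samples for CV+VC against n*n*m for complete CVCs, so the gap widens quickly as the consonant inventory grows.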

    With this you need a complete CVC syllable, unless people appropriate it for VC otos. I'm just not sure how yet.
    Say you wanted to control the timing of a diphthong ending in a consonant: you could use endcons to ensure that the consonant stays unstretched while only the length of the vowel portion changes,
    but in envelope terms that implies something like "increase preutterance and lower STP"???
    The preutterance parameter is the one that determines the difference between the actual start of the audio for that note and the notated beginning in the score.

    [​IMG]

    Here's my theory for the envelope. I'm curious about the danger zone: would this have to be considered an intentional feature? I mean, it wouldn't be the overlap if the notes didn't overlap at that point, but in this case you'd be losing some of the consonant that you're trying to preserve with endcons.

    One way that this could be used would be like a reverse VCV, where instead of using vowels as the blending point, it's consonants. Like a CVVC Japanese voicebank, except grouped as CVCs. Sure, it's inefficient, but VCV is inefficient too.
    ex. consonants are k t p g d b and vowels are a i u. Therefore a reclist with every CVC would look like
    kakatakapakag tatapatagatad papagapadapab gagadagabagak dadabadakadat babakabatabap kikitikipikig titipitigitid pipigipidipib gigidigibigik didibidikidit bibikibitibip kukutukupukug tutuputugutud pupugupudupub gugudugubuguk dudubudukudut bubukubutubup
    To OTO "kakatakapakag" you would split it into these units: [kak][kat][tak][kap][pak][kag]
    A VCV list using the same phonemes (k t p g d b / a i u) would look like this:
    kakakika kikikuki kukukaku tatatita titituti tututatu papapipa pipipupi pupupapu gagagiga gigigugi gugugagu dadadida dididudi dududadu bababiba bibibubi bububabu
    And then of course "kakakika" splits into [- ka][a ka][a ki][i ka]
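
    The two splitting rules can be sketched in a few lines of Python. This is only an illustration under simple assumptions: single-letter consonants and vowels, strictly alternating, and helper names that are invented here rather than taken from any oto tool.

```python
def cvc_units(line):
    """Split a CVCVC... recording into overlapping CVC units.

    Each unit starts on a consonant and shares its last consonant
    with the start of the next unit.
    """
    return [line[i:i + 3] for i in range(0, len(line) - 2, 2)]


def vcv_aliases(line):
    """Split a CVCV... recording into VCV-style aliases.

    The first alias is preceded by "-" (silence); every later alias
    is preceded by the vowel of the syllable before it.
    """
    syllables = [line[i:i + 2] for i in range(0, len(line), 2)]
    aliases = []
    previous_vowel = "-"
    for syllable in syllables:
        aliases.append(f"{previous_vowel} {syllable}")
        previous_vowel = syllable[-1]
    return aliases


print(cvc_units("kakatakapakag"))
print(vcv_aliases("kakakika"))
```

    Running it on the two example strings gives the same units as listed above, which is a handy sanity check when writing a reclist by hand.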

    Just in terms of Japanese, imagine a voicebank where the input is like this: [- か][a えr][るn][の][o うt][たg][が-]
    Sure, you could just use VCV, but then you'd need a separate [a が] and [a -] sample. So this particular approach to CVC samples is to maximize the context.

    Of course, all of this is completely pointless if the OTO limit hasn't gone up from 2^15 lines.
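
    For a rough sense of scale, here's a small Python sketch that takes the 2^15 figure at face value and assumes one OTO line per CVC alias per pitch. Both the helper names and that one-line-per-alias assumption are mine, not anything confirmed about UTAU.

```python
OTO_LINE_LIMIT = 2 ** 15  # figure quoted above; not verified here


def cvc_alias_count(n_consonants, n_vowels):
    """Number of complete CVC aliases for one pitch."""
    return n_consonants * n_vowels * n_consonants


def max_pitches(aliases_per_pitch, limit=OTO_LINE_LIMIT):
    """How many full pitches fit under the limit (assumes one line per alias)."""
    return limit // aliases_per_pitch


# The toy inventory from above: 6 consonants (k t p g d b), 3 vowels (a i u).
per_pitch = cvc_alias_count(6, 3)
print(per_pitch, max_pitches(per_pitch))
```

    The toy inventory fits comfortably, but the per-pitch count grows with the square of the consonant count, so a realistic phoneme set would hit the ceiling much sooner.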
     
    Last edited: Jan 9, 2019
  12. Sylveranty

    Sylveranty Ruko's Ruffians Defender of Defoko

    Messages
    151
    Likes Received
    105
    Trophy Points
    58
    This is quite an interesting feature to be incorporated, but I also wondered how it should be properly used.

    From how I understand it currently, the pink field basically tells the resamplers to not stretch/loop this part, the white part gets stretched/looped, and the new green part would then be incorporated into the note when the note is nearing its end without being stretched or looped.
    And as Kiyoteru has said, that would mean you'd need a complete CVC syllable to use this feature. Every recording would be kinda closed in itself, starting and ending in what the recording comes with. If it could be specified that "this is the ending [ot]" and this ending [ot] would always be played when the resampler reads it in any combination of text in the note, e.g. "rot" automatically taking [ro] and [ot] out of their recordings, that would probably be pretty neat.

    If however this indeed means you need to have every possible C(CC)V with every possible VC(CC) of the language, that'd be a monster of a voicebank.
    German has up to 15 single vowels, not counting diphthongs and counting schwa and a-schwa as their own phonemes.
    There are around 24 consonants that can stand in the starting position of a syllable, and around 13 that stand in the coda; consonant clusters aren't even considered yet. You easily end up with at least 4000 lines or recordings, even if you scratch the a-schwa. And again, that is without consonant clusters. Adding CCV to that, the number of recordings at least doubles. This still excludes ending clusters, though I'd put those together out of [VC]+[CC] either way. Having 8,000 to 10,000 recordings for one pitch is quite overwhelming and I wouldn't be a fan of it.
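
    The estimate can be checked with a few lines of Python. This is just the multiplication spelled out, using the approximate phoneme counts given above; the factor of two for CCV is the "at least doubles" guess, not a measured value.

```python
def cvc_count(onsets, vowels, codas):
    """Number of onset + vowel + coda combinations for one pitch."""
    return onsets * vowels * codas


full = cvc_count(24, 15, 13)             # all 15 vowels
without_a_schwa = cvc_count(24, 14, 13)  # dropping the a-schwa
with_ccv = 2 * without_a_schwa           # assumed doubling once CCV onsets are added

print(full, without_a_schwa, with_ccv)  # 4680 4368 8736
```

    So the "at least 4000" and "8,000 to 10,000" figures hold up under these counts.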

    My numbers can of course be totally wrong, I'm never sure with the logic I apply to maths. I'd love to hear more about this feature and how it works in detail, so that the community can start to figure out how it can be applied or how we have to change our thinking to cleverly use it.
     
  13. Dangosan

    Dangosan Jerboa angel of Light Defender of Defoko

    Messages
    257
    Likes Received
    217
    Trophy Points
    72
    A correct use for that would be VCCV functionality for CVVC voicebanks.
     
    UnclePeter and Soursop the fruit like this.
  14. Soursop the fruit

    Soursop the fruit ✧ Fruity & Happy ✧ Defender of Defoko

    Messages
    261
    Likes Received
    537
    Trophy Points
    113
    This feature could be used as an extra in an English bank to quickly create frequently used CVC (or one-syllable) words and reduce copy-pasting notes, plugin usage, or building the word from scratch. The user can create a note, set its length, and insert the word (ex: [through] [heart] [will]).

    I'm curious what the "Endkonsonant" button is for, though. I hope Ameya will tell us about it.
     
    Last edited: Jan 9, 2019
  15. Pokefan2012

    Pokefan2012 Ritsu's Renegades Defender of Defoko

    Messages
    98
    Likes Received
    265
    Trophy Points
    72
    Sorry this ended up so long, I just wanted to give some of my own insight and speculation into the CVC otoing options Ameya's working on as I've always wanted VC otoing and it being CVC makes it all the more interesting. Basically, the tl;dr is that I think it's very cool, but, perhaps, a bit gimmicky and I see more uses for it in the context of VC sounds than CVC sounds.
    ---
    I feel like how useful the CVC otoing will be is going to depend on your intention. For example, if you've created a 2-mora CVVC VB ('がが' 'けけ' etc.) you could probably use the recordings to make ending CVs ('が -' 'け -' etc.) and omit the typical end breaths like 'a -' without having to do anything too crazy with your otoing (this would also work with CV now that I think about it) since you could still easily blend 'け' 'e g' 'が -' together.

    On the other hand, if you've created a VCV or higher mora CVVC bank, creating ending consonants like this would be highly inefficient because you'd have to record so many extra sounds (although, you could probably make it work if you wanted to add 'v c' otos to your VCV bank, alternatively 2-mora VCV would work if you really want) so you're probably better off just sticking with the standard 'a -' ending breaths instead. That said, being able to give the ending breaths a proper VC oto may help prevent breaths getting stretched or distorted in UTAU, so it will still be useful.

    Of course this is just in the context of Japanese and currently existing methods which, of course, don't account for a feature that was completely non-existent initially, and maybe we'll start seeing things like this
    which I'm actually pretty excited to think about and could provide some quality results, but this is, of course, all speculation, and only time and testing can say where this'll lead.

    In the context of languages like English or German, I think the option for CVC otos could be incredibly useful, especially with regards to otoing the ending sounds (stuff like 'est' 'eks') though I'm not sure how a full 'CVC' otoed sound would be utilised, though, granted, I have considerably less experience with non-Japanese vbs. I feel like @Sylveranty pretty much sums up my thoughts for these languages as well regarding CVC otos:

    Regarding languages such as Spanish and Korean, however, I think CVC otos could be pretty revolutionary (being able to oto sounds like 'geul' for example) because they are, to my understanding, a lot less taxing than languages like English or German regarding potential CVC sounds. Of course, this could still run the risk of being inefficient, but singling out the most common CVC sounds and still combining CV with VC for the remaining sounds could result in a pretty nifty voicebank (you could probably make this argument for languages like English or German too, but I could still see that being immensely taxing in comparison).

    I also like Soursop's idea
    I already see people adding things like numbers to their VBs, and I think creating a reclist of the most common words used in songs to go alongside your standard reclist (whatever your preference) could be a really neat use for CVC otoing and speed up the editing process (inb4 someone records a whole dictionary for the meme value). Of course we'll have to wait to see how well something like this actually works in practice.
    ---
    Edit: Tidied things up with spoilers
     
  16. sangv

    sangv Teto's Territory

    Messages
    13
    Likes Received
    27
    Trophy Points
    17
    That's a pretty good idea. Would anyone actually be interested in having a list of commonly used single-syllable words that they can pick and choose from to record if the CVC feature was released? Because if so, I might attempt to compile a list like that myself somehow, by maybe going through the lyrics from some songs in the Billboard top 100 from 2000 to 2018. Kind of sounds a bit daunting, but I guess it'd be worth a try, since I can't really find many good lists of frequently used words in songs out there at the moment.
     
    Soursop the fruit and Kiyoteru like this.