Release of experimental resampler "young3"

Discussion in 'Software & Plugins' started by Zany, Feb 4, 2019.

  1. Zany

    I have decided to release the experimental resampler I made. I call it "young3".
    It is basically tn_fnds but using a different f0 estimation algorithm called "HARVEST".

    Feel free to download the "young3" resampler here: http://www.mediafire.com/file/zk90gdt3s87g8kt

    I have tested "young3" with only a few voice banks, but in general, it is similar to tn_fnds but gives a slightly more solid tone.

    Before using "young3", you should know that this resampler takes around 3-5 times longer to render than the original tn_fnds.
    The resampler is also highly experimental, which means it will not produce nice output every time.
    However, I will introduce some flags below which may help with the quality.

    Flags:
    Since "young3" is basically tn_fnds modified, it accepts all tn_fnds flags.
    One notable flag is the e flag, which changes looping to stretching.

    I have added a few new flags. Note that the flags are case sensitive.
    The following flags are additional flags that can be used:
    · M flag
    Use this flag globally. E.g. Project --> Project Property --> Rendering Options
    This will make the resampler render faster provided that you have specified the pitch of the files in multipitch voicebanks.

    This flag was inspired by moresampler's analysis-f0-range-from-path voicebank configuration.
    It automatically allocates f0 estimation limits based on stated pitch of multipitch voice banks.
    This flag only works if the samples were recorded at the correct pitch and the folders containing them are named after that pitch (F4, A5, etc.).

    There are a few workarounds for monopitch banks. Either place the files in another folder labeled with the appropriate pitch, or use the p flag introduced below.

    · D flag
    This flag forces the resampler to revert back to tn_fnds's method of f0 estimation.
    This will also cause the resampler to render faster.
    Advisable to use it on any specific notes that sound bad.
    Highly recommended to use the D flag on end breaths.

    By placing the D flag globally, the resampler will become tn_fnds.
    Also note that this flag is different from the usual D flag in the default resampler, which is a mid-frequency filter.

    · S flag
    Sometimes sibilance or "S" sounds may sound harsh or buzzy. Using this flag will reduce this.
    Use this flag globally as well.

    This is done by automatically allocating D flag to those notes which have "S" sounds.
    This flag will only work on hiragana voice banks.

    I haven't properly updated this flag's search algorithm, so it may not always work,
    especially if your UTAU's name starts with one of "さしすせそ".

    · p flag (accepts a number from 65 to 587 after the flag; default 262)
    Recommended to use this flag only on individual notes.
    This is the manual version of the M flag.
    Use this website: http://peabody.sapp.org/class/st2/lab/notehz/
    to look up the frequency (in Hz) of the pitch you recorded the samples at, and put that number after the flag.
    e.g. if you recorded at E4 (about 330 Hz), use the flag p330.
    This flag overrides the M flag. Only use it if you know the pitch.

    For monopitch banks, you can use this flag globally as a workaround to the M flag.

    Credits to Kanru Hua since I obtained the website link from the moresampler website.
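
    For anyone who prefers to compute the p value instead of looking it up, here is a minimal sketch. It assumes standard tuning (A4 = 440 Hz, equal temperament) and note names such as E4 or F#3. The function names note_to_hz and p_flag are made up for this example and are not part of young3.

```python
import re

# Semitone offsets of the natural notes within an octave, relative to C.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"": 0, "#": 1, "b": -1}


def note_to_hz(note: str) -> float:
    """Convert a note name like 'E4' or 'F#3' to Hz (assumes A4 = 440 Hz)."""
    match = re.fullmatch(r"([A-Ga-g])([#b]?)(-?\d+)", note.strip())
    if match is None:
        raise ValueError(f"not a note name: {note!r}")
    letter, accidental, octave = match.groups()
    semitone = SEMITONES[letter.upper()] + ACCIDENTALS[accidental]
    midi = 12 * (int(octave) + 1) + semitone  # C4 -> 60, A4 -> 69
    return 440.0 * 2 ** ((midi - 69) / 12)


def p_flag(note: str, lowest: int = 65, highest: int = 587) -> str:
    """Build a p flag string, enforcing the 65~587 range stated above."""
    hz = round(note_to_hz(note))
    if not lowest <= hz <= highest:
        raise ValueError(f"{note} is {hz} Hz, outside {lowest}~{highest}")
    return f"p{hz}"


print(p_flag("E4"))  # E4 is about 329.63 Hz, so this prints p330
```

    The default value of 262 corresponds to C4 under the same assumptions.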

    Possible FAQs
    Question: How do I use "young3"?
    Answer:
    "young3" is used like any other resampler. Download the compressed files, extract and use young3.exe instead of resampler.exe.

    Question:
    I don't understand the flags. Can you give a TLDR?
    Answer:
    In general, when using "young3", use the S flag globally. Render the vocals once, e.g. Play Region. Place the D flag on any bad sounding notes and end breaths. Use the M flag if you want the resampler to run faster. However, I recommend against using the M flag since it often tends to diminish the solid tone that I like.
    But that's just my preference.
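
    To illustrate the S flag caveat from the flag list, here is a rough sketch of how a search for "S" sounds could be fooled by the voicebank's name. This is only one plausible reading, not young3's actual code: the kana set, the example paths, and both function names are assumptions made for this example.

```python
from pathlib import PureWindowsPath

# Hiragana of the "s" row; treating exactly these as "S" sounds is an assumption.
S_KANA = set("さしすせそ")


def has_s_sound(sample_path: str) -> bool:
    """Check only the sample's file name, ignoring the folders above it."""
    stem = PureWindowsPath(sample_path).stem
    return any(ch in S_KANA for ch in stem)


def has_s_sound_naive(sample_path: str) -> bool:
    """Check the whole path, so a voicebank folder name can trigger a match."""
    return any(ch in S_KANA for ch in sample_path)


# Hypothetical voicebank folder whose name starts with an s-row kana.
path = r"C:\UTAU\voice\さくら\か.wav"
print(has_s_sound(path), has_s_sound_naive(path))
```

    With a check like the naive one, every note of such a voicebank would be treated as an "S" sound and sent to the D flag path.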

    Question: Will this resampler work on other platforms such as macOS and Linux?
    Answer:
    I have only tested this on Windows but, assuming tn_fnds works on other platforms, it should work there as well.
    Feel free to test it out.

    Question: I have used tn_fnds before but it sounds wonky with my voice bank. Will this resampler work with my VB?
    Answer:
    This resampler actually uses the updated DIO algorithm together with CHEAP TRICK whenever the D flag is used so the f0 estimation may be better than the original tn_fnds. You can also use the M flag to help mitigate estimation errors. But I'll be pessimistic and say no. However, this should not stop you from trying out "young3" since UTAU is all about experimentation.

    Question: I am using a monopitch bank and I don't know how to use the p flag because I don't know what pitch I recorded at. What do I do?
    Answer:
    You do not need the p flag to run "young3". If you really want to use it, programs such as VocalShifter can be used to find out the pitch. If you are still confused, feel free to DM me your voice samples and I'll tell you which p value to use.

    Question: Why did you name the resampler "young3"?
    Answer:
    I named it as such because "young3" is a branch of tn_fnds. The word "young" could also sound like YANG which stands for Yet ANother Generalised. The true reason for the name is actually just a reference.

    Feel free to contact me on Twitter at @UtauZany if you have any questions.

    Conclusion
    I have been reluctant to release this resampler since it's mostly just a copy of tn_fnds with a few quirks.
    I am also not confident with my coding capabilities and have very little knowledge about the speech synthesis process as well.
    However, sharing with the community is something I felt like doing so here I am.
    I hope you all have fun trying out this resampler and to anyone who is reading this, hope you have a nice day!
     
    Last edited: Feb 4, 2019
  3. sangv

    I like how it sounds a lot so far! I did find some bugs though, kind of unsurprisingly since it's experimental.

    The first one I found is that it seems to crash on any notes ending in [R]; [ang R] was one of the notes I had it happen with. I only tested it with two voicebanks, Xia Yu Yao and JOAN, but with both it happened on every note I had that ended in [R].

    The second is that with some consonants like sh, young3 will sometimes turn the whole note into just a buzzing sound. I tested this with more voicebanks to make sure it wasn't just the first two I tried, and it also happened with Namine Ritsu Eve and Kikyuune Aiko RockLoud JP (with this VB I actually got it to happen on た as well). I couldn't get this to happen with Kikyuune Aiko BalladSoft, though. KYE CV seems to be affected by this glitch in a kinda weird way: nothing happened on し, but it happened with た. I also had young3 crash on あ with this voicebank, which seems really weird to me.

    EDIT: shoot, right after posting this I noticed that you said you've stopped modifying resamplers. sorry for bothering you about this, then :sad:
     
  4. Zany

    It's alright! I stopped modifying young3 because it just takes too long to render compared to other resamplers. However, I may implement some of its code in another resampler I will likely be releasing in the future so your feedback is well appreciated! Thank you so much!
     
  5. Zany

    Decided to give young3 a slight revamp to at least make it a usable resampler. Take note that the previous version has been deleted.

    Feel free to download the updated "young3" resampler here: https://www.mediafire.com/file/7kr8yckc2a9rz44

    The updated flag list is in the readme file as usual. I have also included a Japanese translation of the flag list. However, it is in broken Japanese because it is partly Google-translated. I would like to say sorry in advance for that.

    Some notable flag updates include:

    · B flag (0~100 def50)
    Functions similarly to WARP's B flag for values below 50.
    Retains tn_fnds's B flag behaviour for values above 50.

    · O flag (-100~100 def0)
    Changed the equalisation from a RIAA curve to a more low-pass-filter-like one, so higher flag values boost only the high frequencies and lower values cut them. Low frequencies are no longer suppressed.
    Feel free to contact me if you want a version with the original O flag.

    · W flag (-2, 0~1000 def-1)
    Similar to my 2 other tn_fnds edits, the W flag will automatically use .frq average frequency values if W is specified without any number.
    I believe it to be quite useful in correcting any bad sounding notes.

    · M flag
    M flag which can be used to speed up HARVEST will also now prioritize .frq average frequency values.
    Take note that with faster rendering, the quality of output may be a bit lower, in my opinion.

    Although young3 can predict frequency values using HARVEST, I implore you to generate .frq files before using the resampler.
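
    As a rough illustration of why a known average frequency helps, here is a minimal sketch of turning one (for example, the average value from a .frq file) into narrower f0 search limits. The one-octave margin, the 40 Hz and 1100 Hz bounds, and the function name f0_search_range are assumptions chosen for this example, not young3's actual values.

```python
def f0_search_range(center_hz: float, octaves: float = 1.0,
                    floor_hz: float = 40.0, ceil_hz: float = 1100.0):
    """Return (low, high) f0 limits spanning `octaves` on each side of center_hz.

    The result is clamped to [floor_hz, ceil_hz]; all defaults are illustrative.
    """
    if center_hz <= 0:
        raise ValueError("center_hz must be positive")
    factor = 2.0 ** octaves
    low = max(floor_hz, center_hz / factor)
    high = min(ceil_hz, center_hz * factor)
    return low, high


# A sample whose average pitch is around C4 (262 Hz).
print(f0_search_range(262.0))  # (131.0, 524.0)
```

    A narrower window gives the estimator fewer candidate frequencies to consider, which fits the speed-up described for the M flag, although how young3 picks its limits may differ.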

    Updated FAQs
    Question: I don't understand the flags. Can you give a TLDR?
    Answer:
    Although there might be a few bugs that I have yet to find, young3 will now work decently without needing any extra flags. You may want to use the M flag to speed up the rendering, but that is entirely up to you. B0 is also now an acceptable flag to reduce breathiness.
    If errors do pop up, most of the time you can fix them with the W flag.

    Question: Do I still need to care whether I am using a multipitch or a monopitch voice bank to use young3 properly?
    Answer:
    No worries. As long as the .frq files are available in the same directory as your voicebank, young3 will work as intended.

    Question: I have used tn_fnds before but it sounds wonky with my voice bank. Will this resampler work with my VB?
    Answer:
    Yes, it will! HARVEST is more robust than DIO, and I believe you can fix most errors with the W flag, provided your UTAU frequency files (.frq) are generally accurate. However, I advise against using the W flag globally, because STONEMASK does not predict voiced/unvoiced (vuv) decisions, so the b flag (consonant emphasis) will not work as intended.

    I will now admit a mistake in my previous FAQ: STONEMASK is used, not CHEAPTRICK.

    Speaking of CHEAPTRICK, the world4utau update (which uses it) is still in the works. I got a little sidetracked revamping young3.
    Feel free to contact me on Twitter at @UtauZany if you face any problems or have any feedback.

    TLDR: New young3 update is up, it is more stable now! I hope you have fun testing it out!
     
    Last edited: Jun 15, 2019
