It's out!
I've updated world4utau with the latest software.
Feel free to download it here: https://bowlroll.net/file/203064
There are a few things you need to take note of when using this version of world4utau (w4u).
First, w4u relies partially on .frq files, so please make sure they have been generated beforehand.
A personal tip from me:
Please do not use SpeedWagon to create .frq files if your voicebank is a power-scale one.
Use fresamp or resampler10 instead.
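If you want to check a voicebank for missing .frq files before rendering, something like the sketch below may help. It is not part of w4u. It assumes the usual UTAU naming where "a.wav" gets a frequency map called "a_wav.frq" in the same folder, and the function name missing_frq is just something I made up for this example.

```python
from pathlib import Path


def missing_frq(voicebank_dir):
    """Return the .wav files that have no matching .frq file next to them."""
    missing = []
    for wav in sorted(Path(voicebank_dir).rglob("*.wav")):
        # Assumed naming convention: "a.wav" -> "a_wav.frq" in the same folder.
        frq = wav.with_name(wav.stem + "_wav.frq")
        if not frq.exists():
            missing.append(wav)
    return missing
```

Call missing_frq("path/to/voicebank") and generate .frq files for whatever it returns (with fresamp or resampler10 for power-scale voicebanks, as mentioned above).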
Next, w4u will produce three files, ".dio", ".ctspec" and ".d4c" when rendering.
This is similar to the original w4u, except that the ".ctspec" and ".d4c" files replace ".star" and ".platinum" respectively.
These files are generated automatically the first time you use w4u as a resampler. They work similarly to moresampler's .llsm files and speed up rendering. You may delete them, but rendering will then slow down significantly, since regenerating the files takes a long time.
The ".dio" files, however, are still the same and are still compatible with nmasao1's frqeditor plugin.
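For anyone who wants to clear these cache files by hand, here is a small sketch. It only matches files by the three extensions named above and makes no assumption about the rest of the file name. The function names and the dry_run parameter are made up for this example and are not part of w4u.

```python
from pathlib import Path

# The three cache extensions w4u writes, as described in this post.
CACHE_SUFFIXES = {".dio", ".ctspec", ".d4c"}


def find_cache_files(voicebank_dir):
    """List every cache file under the voicebank folder."""
    return sorted(
        p for p in Path(voicebank_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in CACHE_SUFFIXES
    )


def clear_cache(voicebank_dir, dry_run=True):
    """Delete the cache files; with dry_run=True only report what would go."""
    files = find_cache_files(voicebank_dir)
    if not dry_run:
        for p in files:
            p.unlink()
    return files
```

Run it with dry_run=True first to see what would be removed. The .wav and .frq files are never touched.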
I also ported a few flags from other resamplers.
· R flag
Use this flag to regenerate the .dio, .ctspec and .d4c files.
Advisable to use after editing .frq files.
Works similarly to TIPS's R flag.
· B flag (0~100 def50)
Breathiness flag ported from other resamplers.
Functions similarly to WARP's B flag for values below 50.
Retains tn_fnds's B flag behaviour for values above 50.
· O flag (-100~100 def0)
One-to-one port of tn_fnds's O flag.
Changes the voice "brightness". Positive values suppress lower frequencies and amplify higher ones.
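To make the ranges and defaults above concrete, here is a rough sketch of how a flag string such as "B30O-20R" could be read. This is not w4u's actual parser. The names FLAG_SPECS and parse_flags are hypothetical, and clamping out-of-range values is my own assumption about sensible behaviour.

```python
import re

# (min, max, default) as listed in this post; None marks an on/off switch.
FLAG_SPECS = {"B": (0, 100, 50), "O": (-100, 100, 0), "R": None}

# One letter, optionally followed by a signed integer.
_TOKEN = re.compile(r"([A-Za-z])([+-]?\d+)?")


def parse_flags(text):
    """Turn a flag string into a dict of values, starting from the defaults."""
    values = {
        name: (False if spec is None else spec[2])
        for name, spec in FLAG_SPECS.items()
    }
    for name, number in _TOKEN.findall(text):
        if name not in FLAG_SPECS:
            continue  # flags not covered by this sketch are ignored
        spec = FLAG_SPECS[name]
        if spec is None:
            values[name] = True
        elif number:
            low, high, _default = spec
            # Assumption: out-of-range values are clamped into the range.
            values[name] = max(low, min(high, int(number)))
    return values
```

For example, parse_flags("B30O-20R") gives {"B": 30, "O": -20, "R": True}, and an empty string gives the defaults (B 50, O 0, R off).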
This next part is quite important.
The default method of lengthening notes in w4u is stretching, i.e. world4utau is a stretching resampler.
However, I implemented a loop system into w4u which you can activate using the e flag.
· e flag
Changes the rendering method from stretching to looping.
This is the opposite of tn_fnds, where looping is the default and the e flag switches to stretching.
By using the e flag, long notes may sound smoother.
Take note that, although I referred to tn_fnds and EFB-GT when creating the loop-based system, it was written around w4u's own code and is not a direct port, so the loops may not be as smooth.
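If the difference between stretching and looping is unclear, the toy sketch below shows the general idea on a plain list of analysis frames. It is only an illustration and not the algorithm used in w4u, tn_fnds or EFB-GT. The ping-pong style loop is just one simple way to avoid a jump at the seam.

```python
def stretch(frames, target_len):
    """Lengthen by spreading the original frames over the new length."""
    n = len(frames)
    # Each output position maps back to a proportional input position.
    return [frames[i * n // target_len] for i in range(target_len)]


def loop(frames, target_len):
    """Lengthen by replaying the segment forwards and backwards."""
    # Forward pass plus the reversed interior, so the seam has no jump.
    cycle = frames + frames[-2:0:-1]
    return [cycle[i % len(cycle)] for i in range(target_len)]


frames = [0, 1, 2, 3]
print(stretch(frames, 8))  # [0, 0, 1, 1, 2, 2, 3, 3]
print(loop(frames, 8))     # [0, 1, 2, 3, 2, 1, 0, 1]
```

Stretching slows every frame down evenly, while looping keeps the original speed and repeats the material, which is why the two can sound quite different on long notes.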
The output of world4utau is similar to EFB-GT's, but rendering takes around 10 times longer (averaged over first-time and subsequent renders).
Then you may ask, why use world4utau?
Here are some reasons:
1) Better noise reduction. EFB-GT has engine noise while world4utau barely has any.
2) Slightly more accurate frequency estimation, comparable to moresampler's. It is also editable using nmasao's frqeditor.
3) The flags, especially the g flag. The one in EFB-GT does not do formant conversion (gender change).
4) It is a stretching resampler, which makes it a good choice for non-Japanese VBs. I also believe that world4utau's stretching is better than tn_fnds's.
I am also releasing another world4utau edit.
I call this creation worldend4utau, AKA wn4u.
Download link here: https://bowlroll.net/file/203027
I am releasing this resampler as a proof of concept. The most common suggestion for making a resampler is to port the latest WORLD one-to-one. In theory that sounds good, but I believe otherwise.
wn4u has a much longer rendering time than w4u since it uses HARVEST for its F0 estimation.
So take note that it produces ".harvest" files instead of ".dio".
HARVEST is a good F0 estimator but, in my opinion, it is not suited to a UTAU resampler because of how long it takes to estimate F0. There are also a few other flaws which I noticed when creating young3.
But hey, that's just my opinion! Feel free to try out wn4u.
In general, wn4u will produce slightly more realistic vocals than the w4u update above because it uses the latest d4c algorithm. (I used conventional d4c for w4u because I felt it was more robust.)
For this reason, I added an additional flag, n (case sensitive).
· n flag
Use only if synthesized output sounds bad.
Converts the aperiodicity estimation back to the conventional one.
Take note that this will automatically activate the R flag.
Use only on specific notes.
Other than that, wn4u is identical to w4u.
Warning: using worldend4utau (wn4u) will overwrite world4utau (w4u)'s .d4c and .ctspec files, and vice versa.
That is why I am releasing wn4u with a password. The password is young3
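If you plan to switch between the two on the same voicebank, you could copy the shared cache files somewhere safe first. The sketch below is one possible way to do that and is not something either resampler does for you. The function name stash_cache and the "_cache_<label>" folder layout are made up for this example.

```python
import shutil
from pathlib import Path

# The two cache types that w4u and wn4u overwrite for each other.
SHARED_SUFFIXES = {".d4c", ".ctspec"}


def stash_cache(voicebank_dir, label):
    """Copy .d4c/.ctspec files into '<voicebank>/_cache_<label>/'."""
    root = Path(voicebank_dir)
    backup_root = root / ("_cache_" + label)
    copied = []
    for p in sorted(root.rglob("*")):
        rel = p.relative_to(root)
        # Skip anything that already lives inside a backup folder.
        if any(part.startswith("_cache_") for part in rel.parts):
            continue
        if p.is_file() and p.suffix.lower() in SHARED_SUFFIXES:
            dest = backup_root / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(p, dest)
            copied.append(dest)
    return copied
```

For example, call stash_cache("path/to/voicebank", "w4u") before rendering with wn4u, then copy the files back when you return to w4u.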
That's that. Thank you for reading this long post. Have fun!