tn_fnds_PB is noisy

Arisa-chan · Nov 3, 2024

I have been scouring the internet trying to find a way to make this voicebank work. Now I am running into added engine noise, even though this newer version of the resampler should be better. Am I using the resampler wrong, or am I missing some flags? I am happy to share more documentation, like the flag YAML file and screen captures.

When rendering Maiko under tn_fnds_PB I get spot-on pitch shifting, but all volume is lost and there is heavy engine noise. If I use the original tn_fnds instead, I get the velocity back, but lose all the pitch points and get badly stretched long notes. I have removed all of the expressions to see where the problem lies.

The example below shows tn_fnds_PB, then tn_fnds, as raw WAV exports from OpenUtau.

Resampler: tn_fnds or tn_fnds_PB
Wavtool: wavetool64 (although all the other ones I have produce the same result)
Flags enabled: W-1 on all notes

Please tell me if you can pinpoint what steps I can take to remove the 'noise' when using PB.

tyvy · Nov 3, 2024

I think maybe this has to do with the frq files

Arisa-chan · Nov 3, 2024

tyvy said:
I think maybe this has to do with the frq files

I have noticed the option to redo the frq files when viewing the singer edit window. My only problem is I am unsure what each option means...

tyvy · Nov 3, 2024

We also use these to train RVC datasets. The options are the pitch extraction methods, look:

The crepe pitch extraction method, if I'm not mistaken, is the intermediate one.

Arisa-chan · Nov 3, 2024

tyvy said:
We also use these to train RVC datasets. The options are the pitch extraction methods, look:
View attachment 12784

The crepe pitch extraction method, if I'm not mistaken, is the intermediate one.

I converted to each one, but nothing changed in terms of noise with PB. I have not tried it under the original version yet. Maybe I am missing a step after setting the new FRQ type and refreshing the Singer window.

Kiyoteru · Nov 3, 2024

Maiko is not meant for pitched vocals, so there's no need for frq files. W-1 should be disabling all pitch shifting.

Arisa-chan · Nov 4, 2024

Kiyoteru said:
Maiko is not meant for pitched vocals, so there's no need for frq files. W-1 should be disabling all pitch shifting.

If I remove the pitch points, would that remove the excess engine noise while using PB? I don't mind using the original tn_fnds, but the stretching it does on longer notes is very choppy or erratic. What would I do to fix that?

Kiyoteru · Nov 4, 2024

The advantage of tn_fnds over other resamplers for rough vocals is that it extends notes by looping. You will probably need to edit the oto.ini if the looping isn't smooth enough. Both classic UTAU and OpenUtau have built-in oto editors. The area between the fixed region (sometimes mistakenly translated as "consonant") and the cutoff is the section that gets looped.
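To make the looped section concrete, here is a minimal Python sketch that reads one oto.ini line and reports where the loop region starts and ends. It assumes the usual field order (`wav=alias,offset,consonant,cutoff,preutterance,overlap`, all in milliseconds) and the usual cutoff convention (positive counts back from the end of the file, negative counts forward from the offset). The names `OtoEntry`, `parse_oto_line`, and `loop_region` are made up for this illustration and are not part of any resampler or editor.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    wav: str
    alias: str
    offset: float        # ms from the start of the wav
    consonant: float     # length of the fixed region in ms, measured from offset
    cutoff: float        # ms; positive = from end of wav, negative = from offset
    preutterance: float
    overlap: float


def parse_oto_line(line: str) -> OtoEntry:
    """Parse a single 'wav=alias,offset,consonant,cutoff,preutterance,overlap' line."""
    wav, rest = line.strip().split("=", 1)
    alias, *nums = rest.split(",")
    # Empty numeric fields are treated as 0 (an assumption for this sketch).
    offset, consonant, cutoff, preutterance, overlap = (float(n or 0) for n in nums)
    return OtoEntry(wav, alias, offset, consonant, cutoff, preutterance, overlap)


def loop_region(entry: OtoEntry, wav_length_ms: float) -> tuple[float, float]:
    """Return (start, end) in ms of the part between the fixed region and the cutoff."""
    start = entry.offset + entry.consonant
    if entry.cutoff < 0:
        end = entry.offset - entry.cutoff    # negative cutoff is relative to offset
    else:
        end = wav_length_ms - entry.cutoff   # positive cutoff counts back from the end
    return start, end


if __name__ == "__main__":
    entry = parse_oto_line("sa.wav=さ,50,120,-400,80,30")
    print(loop_region(entry, wav_length_ms=1000.0))
```

With the hypothetical entry above, the loop region runs from 170 ms to 450 ms. If that stretch contains a strong swell or a change in tone, the repeat will be audible, so it is worth moving the fixed region and cutoff until the span between them sounds steady.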

Arisa-chan · Nov 4, 2024

Kiyoteru said:
The advantage of tn_fnds over other resamplers for rough vocals is that it extends notes by looping. You will probably need to edit the oto.ini if the looping isn't smooth enough. Both classic UTAU and OpenUtau have built-in oto editors. The area between the fixed region (sometimes mistakenly translated as "consonant") and the cutoff is the section that gets looped.

I am a bit confused about whether I am even setting up the flags correctly. The documentation for the resampler from Zany lists [ W flag (50~1000 def 0) ], with def meaning default, right? But I saw another thread where you demonstrated setting up the W flag as F0 with -1. Otherwise I used the flag guide made by susrever to set up the others where they existed, such as M and L, which are also referenced in the resampler guide.

Kiyoteru · Nov 4, 2024

・W flag (-1, 50~1000)
　A flag for death voice. It behaves differently from the original's W flag.
　It overwrites the result of the F0 analysis with the specified frequency. Specifying -1 treats the sample as unvoiced.
　Adjust it using the "freq avg" value in the source sample's profile as a baseline.

The default value of W is W0, which means that the resampler will estimate the pitch (or, in the case of macres, you can use the f flag to read the pitch from the frq file).
Set W-1 to disable pitch analysis and pitch shifting.
Set other values of W to manually specify the base frequency. For example, if you set W440, the resampler will act as if the original pitch of the sample is 440 Hz, or A4.
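As a rough illustration of the three cases, here is a small Python sketch that describes what a given W value means according to the readme excerpt above, and converts a frequency to the nearest note name in equal temperament (A4 = 440 Hz). The helpers `describe_w_flag` and `hz_to_note` are hypothetical names for this example only; they do not come from tn_fnds, and other derivatives may treat W differently.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_note(freq_hz: float) -> str:
    """Nearest equal-tempered note name, with A4 = 440 Hz = MIDI note 69."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def describe_w_flag(value: int) -> str:
    """Summarise a W flag value as described in the tn_fnds readme excerpt."""
    if value == 0:
        return "W0: the resampler estimates the pitch itself (default)"
    if value == -1:
        return "W-1: treated as unvoiced, no pitch analysis or pitch shifting"
    if 50 <= value <= 1000:
        return f"W{value}: base frequency forced to {value} Hz (about {hz_to_note(value)})"
    raise ValueError("W must be 0, -1, or between 50 and 1000")


if __name__ == "__main__":
    for w in (0, -1, 440):
        print(describe_w_flag(w))
```

So if a sample's "freq avg" reads around 220, W220 tells the resampler to treat the sample as sitting near A3.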

Arisa-chan · Nov 4, 2024

Kiyoteru said:
・W flag (-1, 50~1000)
　A flag for death voice. It behaves differently from the original's W flag.
　It overwrites the result of the F0 analysis with the specified frequency. Specifying -1 treats the sample as unvoiced.
　Adjust it using the "freq avg" value in the source sample's profile as a baseline.

The default value of W is W0, which means that the resampler will estimate the pitch (or, in the case of macres, you can use the f flag to read the pitch from the frq file).
Set W-1 to disable pitch analysis and pitch shifting.
Set other values of W to manually specify the base frequency. For example, if you set W440, the resampler will act as if the original pitch of the sample is 440 Hz, or A4.

So in order to set the freq manually I would change the min-max entries. Am I even doing X, M, L correctly?

Berrweary · Nov 4, 2024

I usually use the "e" flag in the resampler flags.
It removes all pitch shifting, but it also kind of just plays the wav file.

Kiyoteru · Nov 4, 2024

For tn_fnds and derivatives, the e flag will force stretching instead of looping, which is not what you want for rough vocals like grit/scream/whisper.

Arisa-chan · Nov 4, 2024

Are my flags set correctly to work with tn_fnds_PB? Even with W-1 enabled, PB sounds no different than before: very metallic and muffled.

Kiyoteru · Nov 4, 2024

Does the original tn_fnds work as expected? Have you adjusted the looping region of the OTO?

Arisa-chan · Nov 4, 2024

I have become so turned around that I am not sure what I am doing anymore. Reverting to tn_fnds original for my own sanity. I have the flags set the same way above and am still getting the 'ambulance' effect. Tried messing with the OTO with very little luck...

Zany · Nov 9, 2024

tyvy said:
I think maybe this has to do with the frq files

tyvy is correct and wrong at the same time. The .frq file does not affect any other tn_fnds variant. In tn_fnds_PB's case, however, it sets a base f0 value based on the .frq file of the VB. This might sound strange, but to use tn_fnds_PB with Maiko, you'll need to delete the .frq file. This sets the base f0 to 0.

So yes, the solution to the problem is to delete the .frq file for the notes that have that "noisy problem".
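If there are several samples to handle, something like the Python sketch below could do it in one pass. It moves the files into a backup folder rather than deleting them, so the change can be undone. It assumes the usual UTAU naming where `sa.wav` has its frequency map in `sa_wav.frq` next to it; the function name `set_aside_frq`, the folder name `frq_backup`, and the example path are all made up for illustration.

```python
from pathlib import Path
import shutil


def set_aside_frq(voicebank_dir, sample_stems, backup_name="frq_backup"):
    """Move '<stem>_wav.frq' files out of the voicebank folder into a backup subfolder.

    Returns the names of the files that were actually moved.
    """
    vb = Path(voicebank_dir)
    backup = vb / backup_name
    moved = []
    for stem in sample_stems:
        frq = vb / f"{stem}_wav.frq"
        if frq.exists():
            backup.mkdir(exist_ok=True)   # create the backup folder only when needed
            shutil.move(str(frq), str(backup / frq.name))
            moved.append(frq.name)
    return moved


if __name__ == "__main__":
    # Hypothetical path and sample name; adjust to your own voicebank.
    print(set_aside_frq("path/to/Maiko", ["さ"]))
```

To restore a file, move it back from the backup folder into the voicebank folder.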

Kiyoteru's advice to use W-1 is extremely sound, but it is applicable only to the original tn_fnds. I made tn_fnds_PB with singing VBs in mind, so it'll actually set an f0 value regardless (maybe I should change it but idk).

Also feel free to try out other tn_fnds derivative resamplers such as
young3 : https://bowlroll.net/file/203018
macres : https://github.com/titinko/macres

There's a lot of experimentation involved when using UTAU so if you ever feel turned around, take a short break and try again whenever you have the motivation for it.

Arisa-chan · Nov 9, 2024

Did some more experimenting today. Macres seems to work better, but I am still getting a lot of wobble in the elongated note of さ. I have deleted the FRQ related to that note and now it sounds very low and almost gender shifted down or demonic. No flags have been enabled on this note when testing Macres.

On another note, I do worry I am not setting the X, M, and L flags correctly on the other versions. Could someone check my post and tell me if they are done right?

Zany · Nov 11, 2024

Arisa-chan said:
Did some more experimenting today. Macres seems to work better, but I am still getting a lot of wobble in the elongated note of さ. I have deleted the FRQ related to that note and now it sounds very low and almost gender shifted down or demonic. No flags have been enabled on this note when testing Macres.

On another note, I do worry I am not setting the X, M, and L flags correctly on the other versions. Could someone check my post and tell me if they are done right?

There is no reason to use the X, M, and L flags, since they are pitch correction flags, which will not work with Maiko.
As stated by Kiyoteru, just use W-1, W0 or no flags at all for the best result in my opinion.

I tried rendering さ with tn_fnds variants and I do understand your plight. I can't offer much but another possible solution is to use WARP as the resampler. I have linked the download to WARP below. Take note that the W flag does not exist for WARP.

WARP : http://custom-made.seesaa.net/article/312530509.html

The problem here though is that WARP requires good .frq files to work and that is difficult to achieve with a mostly unvoiced VB such as Maiko.
So to save you some time, I have rendered and edited some of Maiko's .frq files to help you. They are not perfect but they'll do the job I think.

Edited .frq for Maiko :

I have also added in the oto I've used, so feel free to use that too. However, I recommend backing up whatever oto you have first.

As Kiyoteru mentioned, you do need to adjust the oto to get a good looped portion, so please attempt that. I reckon it might be easier to find it using the .frq I provide as a guide but, as I said before, .frq files with Maiko are usually not accurate, so it is still up to your own discretion.

hottopic_wannabe · Nov 12, 2024

I'm not sure if this is the place for this question, but what would be the difference between using Maiko with tn_fnds and using her voicebank with moresampler and with the U and Me flags for example? I was having some problems with tn_fnds so I just started doing that instead
