Halo

Icon by Wanpuccino @ DA
Administrator
Defender of Defoko
What tutorial? And is it definitely for sure ENUNU/NNSVS and not Diffsinger? (I'm sorry if this is obvious to you)
 

Halo

I'm following the NNSVS Voice Database Creation Tutorial linked on the NNSVS carrd. I don't have any experience with AI voices, so I can't say for sure that it hasn't been repurposed for Diffsinger. It definitely seems like it's for NNSVS though.
That tutorial is primarily for Diffsinger now. It's been renamed to be a general SVS (singing voice synthesis) guide, but it touches on a lot of Diffsinger-specific info and neglects a fair bit of NNSVS info, since the NNSVS material was outdated.

For example, yeah, it doesn't go into making USTs because you don't have to for Diffsinger. For Diffsinger, you would separate samples into different singers for parallel training so you could use them as voice colors/"flags". If you're going with NNSVS, you definitely do need USTs and will need to add flags manually. I recommend joining the NNSVS Discord server if you can; it's pretty derelict these days, but there are still a few knowledgeable people who will answer when you have a specific tech question.
 

Kawaiine Is Queen

Momo's Minion
Thread starter
but it touches on a lot of Diffsinger-specific info and neglects a fair bit of NNSVS info since it was outdated.
Was the info outdated or is NNSVS itself "outdated" compared to diffsinger? The NNSVS-specific version of the tutorial is on the wayback machine and I'm wondering if it would still work.
 

Halo

Was the info outdated or is NNSVS itself "outdated" compared to diffsinger? The NNSVS-specific version of the tutorial is on the wayback machine and I'm wondering if it would still work.
The tutorial was what I was referring to as outdated! And I believe the Colab notebook it relied on is broken, so you may need to either know how to make a Colab notebook or train locally via the command line. I think a group had wanted to maintain a new one, but as far as I'm aware they haven't had the time yet.

Diffsinger is currently easier to train because its tutorials and tooling are more up to date, and you don't have to make USTs at all (that step is automated). You can use the same labels and audio data for both, though, so if you prep for an NNSVS singer you can also train a Diffsinger pretty easily later down the line if NNSVS/ENUNU become more difficult to use. So it doesn't matter much which one is better or more outdated, since the same data will work with some minor edits.

The main issue I've been trying to get across is that there's little to no up-to-date, public documentation on NNSVS > ENUNU and training. The old doc will probably not get you ideal results, if you're able to get results at all, if only because it doesn't go deep enough into training locally. But as long as you're in contact with an actual person who already knows how to do it, it will be fine! That's why I suggested joining the server as a backup for when the old document describes something that just doesn't work or exist anymore, or when there are more options now and you don't know what they do.
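
Since the same labels and audio get reused for either engine, it can help to confirm that every recording has a matching label file before you start training anything. Here's a rough sketch of that check, not something from the tutorial or from either toolchain. The function name `find_unpaired`, the `.wav`/`.lab` extensions, and the idea that audio and labels share a file name are all assumptions on my part, so adjust them to however your data is actually laid out.

```python
from pathlib import Path


def find_unpaired(data_dir, audio_ext=".wav", label_ext=".lab"):
    """Return (audio stems with no label, label stems with no audio).

    Hypothetical helper: assumes a recording and its label share the same
    file name apart from the extension, e.g. song01.wav and song01.lab.
    Files are matched by name only, so same-named files in different
    subfolders are treated as one entry.
    """
    root = Path(data_dir)
    audio = {p.stem for p in root.rglob(f"*{audio_ext}")}
    labels = {p.stem for p in root.rglob(f"*{label_ext}")}
    return sorted(audio - labels), sorted(labels - audio)


if __name__ == "__main__":
    import sys

    # Check the folder given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing_labels, missing_audio = find_unpaired(target)
    for stem in missing_labels:
        print(f"no label for audio: {stem}")
    for stem in missing_audio:
        print(f"no audio for label: {stem}")
```

Run it with your dataset folder as the only argument; if it prints nothing, every file found has a partner.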
 
