Was the info outdated, or is NNSVS itself "outdated" compared to DiffSinger? The NNSVS-specific version of the tutorial is on the Wayback Machine, and I'm wondering if it would still work.
The tutorial was what I was referring to as outdated! I also believe the Colab notebook it relied on is broken, so you may need to either know how to make a Colab notebook or train locally via the command line. I think a group wanted to maintain a new one, but as far as I'm aware they haven't had the time yet.
DiffSinger is currently easier to train because there are more up-to-date tutorials and tooling, and you don't have to make USTs at all (it's automated). You can use the same labels and audio data for both, though, so if you prep for an NNSVS singer you can also train a DiffSinger model pretty easily down the line if NNSVS/ENUNU become more difficult to use. So it doesn't matter much which one is better or outdated, since the same data will work for either with some minor edits.
The main point I've been trying to get across is that there's little to no up-to-date, public documentation on NNSVS > ENUNU and training. The old doc probably won't get you ideal results, if you're able to get results at all, if only because it doesn't go deep enough into training locally. But as long as you're in contact with an actual person who already knows how to do it, it will be fine! That's why I suggested joining the server as a backup for when the old document outlines something that just doesn't work or exist anymore, or when there are more options now and you don't know what they do.