OK, so I had this idea. You know how people make stuff like Justin Bieber UTAUs or Lady Gaga UTAUs? Well, I thought you could easily create a voicebank from any voice sample!
So here's my idea: you know how there's speech recognition software nowadays? Like, a computer can tell whether you're saying "pancake" or "peaches". Well, I thought you could use that technology in an "auto UTAU voicebank creator"!
So, in the program, you would choose a voice sample to work with, such as an acapella by a singer. You would then choose a text file containing a CV reclist, with each sound on its own line. Then the program would scan the audio file, use the speech recognition software to detect any of the sounds on the list, and save each one as its own file in a folder you specify. Once you've got all the sounds, you can set them all to the same pitch.
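To make those steps a bit more concrete, here's a rough Python sketch of the "cut out the sounds on the list" part. It is only a sketch under some big assumptions: the function names (`load_reclist`, `extract_samples`) are made up for illustration, and the `segments` list of (sound, start time, end time) is assumed to come from some recognizer or aligner that is not shown here. Getting those timestamps is the hard part. It only uses Python's standard `wave` module, so it expects an uncompressed WAV file.

```python
import wave
from pathlib import Path


def load_reclist(path):
    """Read one sound per line from the reclist file, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_samples(wav_path, segments, reclist, out_dir):
    """Save each detected sound that is on the reclist as its own WAV file.

    segments is a list of (label, start_sec, end_sec) tuples. This is
    hypothetical recognizer output; producing it is not shown here.
    Returns the labels that were saved, in the order they were found.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    wanted = set(reclist)
    saved = []
    with wave.open(str(wav_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for label, start, end in segments:
            target = out_dir / f"{label}.wav"
            # Skip sounds that are not on the list or are already in the folder.
            if label not in wanted or target.exists():
                continue
            src.setpos(int(start * rate))
            frames = src.readframes(int((end - start) * rate))
            with wave.open(str(target), "wb") as dst:
                dst.setparams(params)  # same channels, sample width, rate
                dst.writeframes(frames)
            saved.append(label)
    return saved
```

For example, calling `extract_samples("vocal.wav", [("ka", 12.4, 12.7)], load_reclist("reclist.txt"), "Avril UTAU")` would try to write the slice from 12.4 s to 12.7 s to `Avril UTAU/ka.wav`, as long as "ka" is on the list and that file doesn't exist yet. The file names and timestamps here are just placeholders.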
Here's an example of a scenario with this program:
I load up Avril Lavigne's acapella cover of "darlin". Then I choose a text file that contains a CV reclist, with each sound on its own line. The program scans the audio and detects that Avril sang "ka". It then saves that as a wav file named "ka" in a folder called "Avril UTAU". But wait, it detected that Avril sang "shi" twice! Well, it then lets me compare the two and choose which one I like best. But not all of the sounds were found. Well, that's OK: I'll just load another audio file and continue with the same list, because if a sound is already in the folder, the program will skip it. Then, after I'm done, I can use the program to set the pitch of all of the samples to C4! Tada! Now just oto it in UTAU, and you're all good!
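The bookkeeping in that scenario (spotting a sound that was found twice, working out which sounds are still missing, and figuring out how far a sample is from C4) could look roughly like this in Python. Again, just a sketch: the function names are made up, the detections are assumed to be (sound, start time, end time) tuples from a recognizer that isn't shown, and actually measuring a sample's pitch or shifting it is left out. The only hard fact used is that C4 is nine semitones below A4, which is about 261.63 Hz when A4 is 440 Hz.

```python
import math
from collections import defaultdict
from pathlib import Path

# C4 is nine semitones below A4 (440 Hz), so about 261.63 Hz.
C4_HZ = 440.0 * 2 ** (-9 / 12)


def group_candidates(segments):
    """Group detections by sound so repeated sounds can be compared."""
    by_label = defaultdict(list)
    for label, start, end in segments:
        by_label[label].append((start, end))
    return dict(by_label)


def needs_choice(candidates):
    """Sounds detected more than once, where the user picks a favourite."""
    return [label for label, takes in candidates.items() if len(takes) > 1]


def remaining_sounds(reclist, out_dir):
    """Sounds on the reclist that have no wav file in the folder yet."""
    have = {p.stem for p in Path(out_dir).glob("*.wav")}
    return [sound for sound in reclist if sound not in have]


def semitones_to_c4(current_hz):
    """Semitones to shift a sample at current_hz so it lands on C4.

    Negative means shift down. Measuring current_hz is not shown here.
    """
    return 12 * math.log2(C4_HZ / current_hz)
```

So if "shi" shows up twice in the detections, `needs_choice` would list it, and `remaining_sounds` tells you what to keep hunting for when you load the next audio file.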
So, are there any programmers out there who think this is plausible? I think this would be so cool! You could even use it on your friends and stuff and surprise them with an UTAU!