
New English UTAU clone

isengaara

Teto's Territory
Defender of Defoko
UTAU is designed to support Japanese, but it does not handle English very well. Other synthesizers such as eCantorix can handle other languages, including English and German. Therefore I took the existing QTau synthesizer and began writing an eSpeak synthesizer plugin. As eSpeak is designed for speech and not singing, I decided to use the WORLD speech toolkit with eSpeak to add singing support. It is also possible to do phonetic transliteration with eSpeak, as VOCALOID does. Here is an example of how this new synth looks.

espeak_utauloid.png
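
To illustrate the idea of adding singing on top of a speech synthesizer: a vocoder such as WORLD splits audio into a pitch (F0) contour, a spectral envelope, and aperiodicity, so singing can be approximated by replacing the spoken F0 contour with one derived from the score. The sketch below only builds such a target contour. It is not the plugin's code, and the function names, the 5 ms frame period, and the note format are assumptions made for illustration.

```python
FRAME_PERIOD_MS = 5.0  # assumed analysis frame period, in milliseconds


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to frequency (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def target_f0_contour(notes, frame_period_ms=FRAME_PERIOD_MS):
    """Build a flat per-frame F0 contour from (midi_note, duration_ms) pairs.

    A midi_note of None marks a rest; rest frames get F0 = 0.0,
    the usual way to mark unvoiced frames.
    """
    contour = []
    for midi_note, duration_ms in notes:
        n_frames = int(round(duration_ms / frame_period_ms))
        f0 = 0.0 if midi_note is None else midi_to_hz(midi_note)
        contour.extend([f0] * n_frames)
    return contour


if __name__ == "__main__":
    # Hypothetical melody: A4 for 500 ms, a 250 ms rest, then C5 for 500 ms.
    melody = [(69, 500), (None, 250), (72, 500)]
    f0 = target_f0_contour(melody)
    print(len(f0), f0[0], f0[100], round(f0[-1], 2))
```

A real synthesizer would also smooth the transitions between notes and add vibrato before passing the contour, together with the spectral envelope and aperiodicity, to the vocoder's synthesis step.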
 

isengaara

Teto's Territory
Defender of Defoko
Thread starter
That looks really cool! Is 'WORLD' working well? (more specifically, is it smooth?)

WORLD is a modern speech toolkit. It has been used in different UTAU resamplers and it produces a clean sound. vConnect-STAND, which uses an old version of WORLD, includes smooth transitions between phonemes. When I started working with WORLD all documentation was in Japanese, but today English documentation is available. Some of the papers that I have found are in Japanese, and I am currently working on an English translation.
And... could this synthesizer have a TTS (text-to-speech) configuration?
The same voices can be used for both speech and singing.
A speech example is here and a singing synthesis example here.
It is also possible to convert speech to singing using STRAIGHT, which is similar to WORLD.
 

Fuutari Makku

Guest
This looks awesome, even though the quality doesn't seem really good. Will it have support for other languages, like Japanese, Italian, Spanish, etc.?
 

na4a4a

Outwardly Opinionated and Harshly Critical
Supporter
Defender of Defoko
You are using speech sources and using them for singing? This is really neat, though honestly, I feel a different vocal source would be better. (Maybe a singing source)

Will the use of espeak allow people to record their own samples and pull phonemes from it? Or are we restricted to the speech voices?
Because I don't feel that those voices are really up to the task of singing.

From the looks of it, I really like where this is going! I have used vConnect Stand and found that the results are really smooth, so I can't wait to see what happens.

So, are you modifying QTau to handle phonetics? Or was this possible before?
 