All utaus I use sound bad even when I fit the ust correctly

Crikey77

Momo's Minion
I'm very new to UTAU and have little to no knowledge, but I know how to use a UST... or so I thought. I fit the UST the way most guides say to, but the UTAU's singing still sounds gross and janky. I don't know if I'm doing something wrong, but could someone who knows a little more about this whole thing tell me how they usually do it? (If that makes any sense.) I usually use the more well-known UTAUs because I thought they'd sound good at least 50% of the time, but they sound good 0% of the time. I was tired when I came up with that logic; the quality doesn't only depend on the voicebank, it's the way it's used, I guess.
 

Cyberbison

Ruko's Ruffians
Defender of Defoko
USTs usually come with some pre-set values (Preutterance and Overlap in Note Properties, for example, which determine how early the note starts and how long the previous note keeps sounding over the new one). Also, for faster songs it helps to raise consonant velocity by 100 or 200. This shortens the time the voicebank spends on the consonant part of the sample.
So maybe try highlighting the whole song (CTRL+A on PC), right-clicking, choosing "Note Properties", and clearing the Preutterance and Overlap boxes. Don't click Reset: that doesn't always bring the values to 0, but instead to the numbers they were set to when you downloaded the song! Then try adding 50, 100, 150 or 200 to consonant velocity. This usually helps me.
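If you'd rather do the same cleanup outside the editor, here is a minimal Python sketch of the idea. It assumes the UST is plain text made of `[#0000]`-style note sections with `PreUtterance=`, `VoiceOverlap=` and `Velocity=` keys, and that a blank value means "use the voicebank's own setting"; those key names and that behaviour are my assumptions, and `clean_ust` is a hypothetical helper, not part of UTAU. It works on text you have already read in (UST files are often Shift-JIS encoded), so always keep a backup of the original file.

```python
import re

def clean_ust(text, velocity=150):
    """Blank PreUtterance/VoiceOverlap and set Velocity on every note section."""
    out = []
    in_note = False        # True while inside a numbered note section
    seen_velocity = False  # True once the current note has a Velocity line

    for line in text.splitlines():
        if line.startswith("[#"):
            # Leaving a note that had no Velocity line: add one before moving on.
            if in_note and not seen_velocity:
                out.append(f"Velocity={velocity}")
            # Only numbered sections such as [#0000] are notes (assumption).
            in_note = re.fullmatch(r"\[#\d+\]", line) is not None
            seen_velocity = False
            out.append(line)
            continue

        key = line.split("=", 1)[0]
        if in_note and key in ("PreUtterance", "VoiceOverlap"):
            out.append(f"{key}=")  # cleared, not reset to the downloaded value
        elif in_note and key == "Velocity":
            out.append(f"Velocity={velocity}")
            seen_velocity = True
        else:
            out.append(line)

    # The file ended while still inside a note with no Velocity line.
    if in_note and not seen_velocity:
        out.append(f"Velocity={velocity}")
    return "\n".join(out)
```

Try a few `velocity` values on a copy of the song and listen to each render; what sounds right depends on the voicebank and the tempo.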
 

Crikey77

Momo's Minion
Thread starter
Thanks, I'll try it and see if it makes the vocals any less janky.
 

Crikey77

Momo's Minion
Thread starter
I know this is a little late, but would it help if I sent the .wav file so you could tell me exactly what's wrong? I might only think it sounds janky, I don't know. Sorry if this is any sort of bother.
 

Cyberbison

Ruko's Ruffians
Defender of Defoko
Hey no worries! I'm no expert but I'll do my best to help you. Could you send a sample? :smile: Sorry for the delay, I had just gone to bed when you answered, haha!

USTs, in my experience, take a fair bit of tuning. A clean UST and a good-quality voicebank are a good start, but you still need to make sure the samples fit and slide into each other smoothly.
Two vowels won't glide into each other unless you tell them to, and so on! This is why it helps, at least for me, to have the UST "clean" before I start filling in my own settings. Soft consonant samples (like "na" or "sa") often need some tweaking in the envelope. Many sounds overlap when singing or talking, so it helps to fiddle with Preutterance, Overlap and similar settings!
Usually the UST is either tuned to sound good with a specific voicebank, or left clean, meaning it has no pitch glides and so on, which are what make the voice sound natural. If the voice snaps onto each pitch 100% with no easing, it sounds robotic and "autotuned" (autotune's first and foremost job is to snap sound onto pitch: to tune it automatically). Sometimes long notes need vibrato to sound more natural.

There are some tuning tutorials you can follow on YouTube!

//EDIT: When you right-click a note you'll find a menu. I can't quite grasp some of the terms myself, but following tutorials for OTOing has helped a lot too.
Okay, don't get scared. It's a lot of text for a few settings and actions. The best way to learn is of course to try and fail, but here are some things that have helped me with tuning.
Here's Kiyoteru's oto tutorial, where the parts of the OTO are explained well.

Envelope - for VOLUME control (includes the PRE. and OVL. boxes, so you can fiddle with them in the Envelope window too). Creating a slope at the beginning of the volume line makes a fade-in; a slope at the end makes a fade-out.
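To picture what those slopes do, here is a small Python sketch of a simplified envelope with one straight fade-in and one straight fade-out. The function name and the 35 ms defaults are made up for illustration; UTAU's real envelope has more points than this.

```python
def envelope_gain(t_ms, length_ms, fade_in_ms=35, fade_out_ms=35):
    """Volume multiplier (0.0 to 1.0) at time t_ms inside a note of length_ms."""
    if t_ms < 0 or t_ms > length_ms:
        return 0.0  # outside the note: silent
    # Rising slope at the start of the note (fade-in).
    gain_in = t_ms / fade_in_ms if fade_in_ms > 0 else 1.0
    # Falling slope at the end of the note (fade-out).
    gain_out = (length_ms - t_ms) / fade_out_ms if fade_out_ms > 0 else 1.0
    # Whichever slope is lower wins, capped at full volume.
    return max(0.0, min(1.0, gain_in, gain_out))
```

A longer fade-in softens the attack, which is the kind of tweak soft consonants like "na" or "sa" tend to need.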

PITCH - this menu gives you many options
Preutterance - determines how early the sound starts, overlapping the end of the previous note
Overlap - determines how much of the previous note keeps sounding while the new one starts
Portamento - this is where you tie notes together with pitch. I like to add a few points to create pitch slides, so that the note's pitch starts where the last note ended and slides into the new pitch, or so that it slides towards the following note's pitch.
Vibrato - shakes the pitch and volume
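As a rough picture of what portamento and vibrato do to the pitch line, here is a Python sketch. Everything in it (the function name, the cosine easing, the default times) is an illustrative assumption rather than how UTAU or any resampler actually computes pitch, and it only models the pitch side of vibrato, not the volume.

```python
import math

def pitch_curve(prev_note, note, length_ms, glide_ms=60,
                vib_depth_cents=0, vib_period_ms=180, step_ms=5):
    """Pitch in semitones (MIDI-style note numbers), sampled every step_ms."""
    points = []
    t = 0
    while t < length_ms:
        if t < glide_ms:
            # Portamento: ease from the previous note's pitch into this one.
            frac = 0.5 - 0.5 * math.cos(math.pi * t / glide_ms)
            pitch = prev_note + (note - prev_note) * frac
        else:
            pitch = note  # glide finished: sit on the target pitch
        # Vibrato: a sine wobble around the pitch, depth given in cents.
        pitch += (vib_depth_cents / 100) * math.sin(2 * math.pi * t / vib_period_ms)
        points.append(pitch)
        t += step_ms
    return points
```

With `glide_ms=0` and no vibrato the curve jumps straight to the new note, which is the robotic "snapped" sound; a short glide plus a little vibrato on long notes is what makes it feel sung.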

PROPERTY - this menu gives you some options too; PRE and OVL are the same as the ones in the Envelope window!
Preutterance - determines how early the sound starts, overlapping the end of the previous note
Overlap - determines how much of the previous note keeps sounding while the new one starts
^^^^^^^These are the same settings that are in the OTO.
Consonant velocity is the speed of the consonant part of the sample (the beginning). Raising this number speeds up the pronunciation of that note's consonant.
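If it helps to see consonant velocity as a number: a rule of thumb often quoted in the UTAU community is that the consonant part is stretched by a factor of 2^((100 - velocity) / 100). I haven't verified this against any particular resampler, so treat the formula, and the hypothetical `consonant_stretch` helper below, as an assumption.

```python
def consonant_stretch(velocity):
    """Length multiplier for the consonant part under the assumed rule of thumb."""
    # 100 leaves the consonant as recorded; higher values shorten it,
    # lower values lengthen it.
    return 2 ** ((100 - velocity) / 100)

# Under this assumption: 100 gives 1.0 (unchanged), 200 gives 0.5 (half as long),
# and 0 gives 2.0 (twice as long).
```

That would explain why higher values help on fast songs: the consonant takes up less of a short note, leaving more room for the vowel.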

BRE - breathiness of the voice, a "whisper" effect (most noticeable at high numbers, but you often need to raise the volume too to compensate)
FLAGS are filters. Some work more noticeably than others, and some require a specific resampler (the tool the program uses to render the sound files) to work properly.
STP is the "starting point" in the oto. Adding to this moves the starting point to a later spot in the sample.
 
