Anatomy of the OTO

This is not a guide to tell you how to OTO any particular type of bank. Rather, my hope is that by explaining how it actually works, you can gain a better understanding of what you're doing with an OTO, and use that to improve and innovate on your work.


The top row shows the view without envelopes, which is the default view in UTAU.
The middle row shows the view with envelopes, which you get when you switch to Mode 2 or click the tilde (~) button in the bottom left. The shape of the envelope reflects the audio level, or volume, of the note. Where it slants up, the note fades in; where it is horizontal, the volume is steady; where it slants down, the note fades out. The beginning and end of the envelope do not necessarily match the beginning and end of the note.
The bottom row is the view in the OTO editor itself. The lines from the OTO editor have been overlaid on the note drawings to show where those parameters would fall in the context of an actual UST.

Current note: Pink note, visible in all three rows
Previous note: Blue note, visible only in the top two rows
Following note: Not pictured. It has the same relationship with the current note that the current note has with the previous note.

The left blue region is the offset. In UTAU-Synth, it has the same name.
This is where the start of the envelope is, and is often before the start of the note. Moving this around allows you to include or cut out audio from the beginning of the file. It is measured in milliseconds relative to the beginning of the file.

The pink region is called the fixed region, and UTAU-Synth labels it "fixed". The name comes from the fact that this is the part that does not stretch or shrink when the length of the note is changed in the UST. However, it is presumed that consonant velocity, a per-note value, can stretch or shrink it. It is measured in milliseconds relative to the offset.

The right blue region is the cutoff, called the blank in UTAU-Synth. This is where the end of the note and the end of the envelope are. However, when the note is short, the note may end before the cutoff. When the following note has a large preutterance and/or a small overlap, the envelope may also end before the cutoff. Changing this parameter allows you to include or cut out audio from the end of the file. A positive value is measured in milliseconds back from the end of the file; a negative value is measured in milliseconds forward from the offset.
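
The two sign conventions for the cutoff can be reduced to one absolute position in the file. Here is a minimal Python sketch of that arithmetic, assuming every value is in milliseconds. The function name `cutoff_position_ms` and the example numbers are made up for illustration and are not part of UTAU.

```python
def cutoff_position_ms(offset: float, cutoff: float, file_length: float) -> float:
    """Return where the usable audio ends, in ms from the start of the file."""
    if cutoff < 0:
        # Negative cutoff: the distance is measured forward from the offset.
        return offset + abs(cutoff)
    # Positive (or zero) cutoff: the distance is measured back from the end of the file.
    return file_length - cutoff


# Hypothetical 1000 ms recording with the offset at 50 ms.
print(cutoff_position_ms(offset=50, cutoff=200, file_length=1000))   # 800
print(cutoff_position_ms(offset=50, cutoff=-600, file_length=1000))  # 650
```

A negative cutoff stays in the same place relative to the offset if the file is later trimmed or padded at the end, while a positive one moves with the end of the file.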

The red line is preutterance in both PC UTAU and UTAU-Synth. This is where the beginning of the note is. Anything between the offset and the preutterance is before the beginning of the note, and "intrudes" on the space of the previous note. It is measured in milliseconds relative to the offset.
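
To make the "intrusion" concrete, here is a small Python sketch that places a sample on the UST timeline. It assumes all values are in milliseconds and ignores any adjustment UTAU may make for short notes. The names `audio_start_ms` and `preutterance_position_in_file_ms` are hypothetical helpers, not UTAU functions.

```python
def audio_start_ms(note_start: float, preutterance: float) -> float:
    """Timeline position where the sample's audio (its offset point) begins.

    The preutterance line is pinned to the start of the note, so everything
    between the offset and the preutterance plays before the note starts.
    """
    return note_start - preutterance


def preutterance_position_in_file_ms(offset: float, preutterance: float) -> float:
    """Position of the red line inside the audio file, measured from the file start."""
    return offset + preutterance


# Hypothetical note that starts 480 ms into the UST, with a 60 ms preutterance.
print(audio_start_ms(note_start=480, preutterance=60))               # 420
print(preutterance_position_in_file_ms(offset=50, preutterance=60))  # 110
```

In this example the audio begins at 420 ms, so the first 60 ms of the sample play during the previous note.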

The green line is the overlap in PC UTAU and UTAU-Synth. This determines the amount of overlap between the envelopes of the previous note and the current note. Envelopes between notes can be crossfaded: the previous note fades out at the same time the current note fades in. The visual result is an X shape between the envelopes. The previous note's envelope ends at the current note's overlap. When the overlap is larger than the preutterance, the previous note is still audible after the current note has started. When the overlap is a negative value, the previous note's envelope ends before the current note's envelope ever starts. It is measured in milliseconds relative to the offset.
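
The three overlap cases described above can be sketched in Python as follows. This is only an illustration of the relationships in this section, assuming all values are in milliseconds and that UTAU makes no further adjustments. The function names are hypothetical.

```python
def previous_envelope_end_ms(note_start: float, preutterance: float, overlap: float) -> float:
    """Timeline position where the previous note's envelope ends."""
    current_envelope_start = note_start - preutterance
    # The overlap is measured from the offset, i.e. from the envelope start.
    return current_envelope_start + overlap


def describe_overlap(preutterance: float, overlap: float) -> str:
    """Classify how the previous note relates to the current one."""
    if overlap < 0:
        return "gap: previous envelope ends before the current envelope starts"
    if overlap > preutterance:
        return "previous note is still audible after the current note starts"
    return "previous envelope ends at or before the start of the current note"


# Hypothetical note starting at 480 ms with a 60 ms preutterance.
print(previous_envelope_end_ms(480, 60, 20))   # 440
print(describe_overlap(60, 20))
print(describe_overlap(60, 90))
print(describe_overlap(60, -10))
```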

The white space between the fixed region and the cutoff/blank has no name, but is nevertheless important. It is the segment that is stretched or looped by the resampler when the note length is longer than the original audio recording.
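
Putting the parameters together, the unnamed stretchable segment can be located from a single OTO entry. The Python sketch below assumes the common oto.ini line layout `filename=alias,offset,fixed,cutoff,preutterance,overlap` with every numeric field present and in milliseconds. The `OtoEntry` class, the function names and the sample line are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    filename: str
    alias: str
    offset: float
    fixed: float
    cutoff: float
    preutterance: float
    overlap: float


def parse_oto_line(line: str) -> OtoEntry:
    """Parse one oto.ini line (assumes no empty numeric fields)."""
    filename, rest = line.strip().split("=", 1)
    alias, *numbers = rest.split(",")
    offset, fixed, cutoff, preutterance, overlap = (float(n) for n in numbers)
    return OtoEntry(filename, alias, offset, fixed, cutoff, preutterance, overlap)


def stretch_segment_ms(entry: OtoEntry, file_length: float) -> tuple[float, float]:
    """Return (start, end) of the stretchable segment, in ms from the file start."""
    start = entry.offset + entry.fixed  # end of the fixed region
    if entry.cutoff < 0:
        end = entry.offset - entry.cutoff      # negative: measured from the offset
    else:
        end = file_length - entry.cutoff       # positive: measured from the file end
    return start, end


entry = parse_oto_line("ka.wav=ka,50,120,-600,60,20")
print(stretch_segment_ms(entry, file_length=1000))  # (170.0, 650.0)
```

Real oto.ini files can contain blank fields or other irregularities, which this sketch does not handle.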

In the note properties of a UST, it is possible to change the preutterance and overlap for an individual note. The STP value in the same dialog adds that number of milliseconds to the currently set offset.

I've heard that these parameters may change based on the actual lengths of the notes in the UST. I don't know the exact numbers or ratios, so if anyone works them out, they would be a much appreciated addition to this resource. Additionally, the function of consonant velocity was speculation on my part. Precise information on that would also be appreciated.

Because of the relationship between OTOs and envelopes, I may add information about the envelope buttons and envelope editor in the future.

Thank you for reading!

More resources from Kiyoteru

Latest updates

  1. Renaming parameter

    The "consonant" parameter is now called the fixed region, which correctly reflects the original...

Latest reviews

This was a very interesting read and I'll definitely be saving for future use. I like that the OTO is explained in a very clear and concise way, and that all the terms are broken down in a way that's easy to understand for beginners. This makes me feel like I could actually attempt OTOing because I have a much better understanding of how the OTO works.