Three executables make UTAU work: the main utau.exe, the resampler, and the wavtool. The main one is just the window where you edit USTs, set up OTOs, and so on. Actually producing the audio is the job of the resampler and the wavtool. UTAU hands the note information from the UST to the resampler, which edits the voicebank audio to match: it changes the length of samples, pitches them up or down, and does other processing according to flags. The separately rendered pieces of audio then go to the wavtool, which stitches them all together.
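To make that split of work concrete, here's a toy sketch in Python. It is not how any real resampler or wavtool is implemented: the function names, the tiny "voicebank", and the note list standing in for a UST are all made up for illustration, and the pitch shifting is the crudest method possible (reading the sample faster or slower).

```python
# Toy illustration only: real resamplers and wavtools are far more sophisticated.

def toy_resample(sample, target_len, semitones):
    """Pitch-shift by reading the sample at a different speed, then
    loop or trim the result so it is exactly target_len samples long."""
    if not sample or target_len <= 0:
        return []
    ratio = 2 ** (semitones / 12)  # frequency ratio for the pitch change
    shifted = []
    pos = 0.0
    while int(pos) < len(sample):
        shifted.append(sample[int(pos)])  # nearest-neighbour read
        pos += ratio
    # Change the length by looping (longer) or trimming (shorter).
    return [shifted[i % len(shifted)] for i in range(target_len)]


def toy_wavtool(rendered_notes):
    """Stitch the separately rendered notes into one track, in order."""
    track = []
    for note in rendered_notes:
        track.extend(note)
    return track


# Hypothetical "voicebank": one very short sample per syllable.
voicebank = {
    "ka": [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5],
    "sa": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
}

# Hypothetical stand-in for a UST: (lyric, length in samples, semitone shift).
notes = [("ka", 12, 0), ("sa", 6, 12)]

# Step 1: the "resampler" renders each note separately.
rendered = [toy_resample(voicebank[lyric], length, shift)
            for lyric, length, shift in notes]

# Step 2: the "wavtool" joins the rendered notes into one track.
song = toy_wavtool(rendered)
print(len(song))
```

The point is only the division of labor: one step shapes each note on its own, and a second step joins the results. Real tools also handle things like crossfades and volume envelopes, which this sketch leaves out.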
There are many resamplers and a few wavtools out there. They all do the same job, but in different ways, since they're made by different people. There's also Moresampler, which can do the job of both a resampler and a wavtool and has a lot of brilliant flags (for example, you can make a single bank sound both soft and strong without needing any appends).
As a Mac user, I also use UTAU-Synth. I could use PC UTAU, but I'm kind of lazy, so I use UTAU-Synth out of convenience. It doesn't let you change resamplers, which doesn't bother me too much.
What matters more to me is the voicebank, the UST, and the mixing. If the voicebank has clear samples and is recorded consistently (same tone of voice, same pitch), the result will sound a lot better. I work with English voicebanks, and English USTs are complicated, so there's a lot of room for them to turn out lower quality. I usually make my own so that I can make sure they sound smooth and fit the voicebank well.
Mixing is the process of bringing together all the different instruments in a song so that they sound good together. Tools such as compression (regulating the loudness) and equalization (changing the loudness of certain audio frequencies) are mainstays, and reverb and delay (echo) are common as well. Most UTAU users make covers, so it's just a matter of adding the vocals to the instrumental. You can look up vocal mixing tips, but since you're working with a synthesized voice and not a human, it's a whole lot easier. For example, UTAU vocals generally don't need to be compressed at all, and you don't need to do pitch correction or remove extra breathing noises. On some parts of a song, you can use effects like chorus, distortion, and filtering to make them stand out and sound interesting.
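If "regulating the loudness" sounds abstract, here's a bare-bones sketch of the idea behind compression in Python. The function name and the threshold and ratio values are my own choices for illustration. It works on each sample value individually, which a real compressor plugin doesn't do (those follow the signal's overall level with attack and release times), so treat it as a picture of the concept rather than something to mix with.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Turn down only the part of each sample's level that exceeds
    the threshold, dividing that excess by the ratio."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        # Restore the original sign of the sample.
        out.append(level if s >= 0 else -level)
    return out


# Two quiet samples and two loud ones (values between -1.0 and 1.0).
quiet_and_loud = [0.2, -0.4, 1.0, -0.9]
print(compress(quiet_and_loud))
```

The quiet samples pass through untouched and only the loud ones get pulled down, which narrows the gap between them. A vocal whose loudness is already even mostly stays under the threshold, so a compressor has little to do.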