Premade USTs always include the lyrics. However, you may find them in a variety of formats: romaji, hiragana, CV, VCV, CVVC, and that's just for Japanese! There may also be extra samples, such as vocal fry, glottal stops, and end breaths. For converting between formats, you may find it more convenient to have an installation of Windows UTAU, running under something like Wine or CrossOver, in a VM such as Parallels, or via Boot Camp. It's still possible to edit things by hand, though.
One convenient feature of UTAU-Synth is that if the UST is in CV Hiragana, you'll be able to use it with VCV voicebanks without needing to do any conversions. CV Hiragana is also a great starting point for working with CVVC Japanese banks, so that's what you should aim to get.
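To give a rough idea of what a by-hand conversion involves, here is a minimal Python sketch that turns VCV lyrics back into CV. It assumes the UST stores each note's lyric on a line beginning with `Lyric=`, and that VCV lyrics look like a vowel (or `-`) followed by a space and the kana, for example `a か` or `- か`. The function name `vcv_to_cv` is made up for this illustration, and a real UST may use aliases or suffixes that this does not handle.

```python
def vcv_to_cv(ust_text: str) -> str:
    """Strip the leading vowel from VCV-style lyrics, e.g. 'a か' -> 'か'.

    Assumption: lyrics live on lines that start with 'Lyric=' and VCV
    lyrics are exactly two space-separated parts. Anything else (plain
    CV, rests such as 'R', other aliases) is left untouched.
    """
    prefix = "Lyric="
    converted = []
    for line in ust_text.splitlines():
        if line.startswith(prefix):
            parts = line[len(prefix):].split()
            if len(parts) == 2:
                # Keep only the kana after the leading vowel or '-'.
                line = prefix + parts[1]
        converted.append(line)
    # Note: lines are rejoined with '\n', whatever the input line endings were.
    return "\n".join(converted)


if __name__ == "__main__":
    sample = "[#0000]\nLength=480\nLyric=- か\nNoteNum=60"
    print(vcv_to_cv(sample))
```

To run it on a real file, you would read the UST into a string first. Japanese USTs are often saved as Shift-JIS, so something like `open(path, encoding="shift_jis")` may be needed, but check the encoding of your own file, and work on a copy rather than the original.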