The things you'll need to make a proper UTAU:
- recording software: Audacity is a fine start [personally, I only use it for creating silence in prerecorded samples], but unless you have a good ear for pitch and are good at staying consistent, I would recommend looking into a proper DAW that can monitor input and supports VSTs such as GTune. Another reason to look into a proper DAW is mixing: you want something that doesn't do destructive editing like Audacity does (meaning you won't be stuck with a bad use of EQ, reverb, etc. if you accidentally save over your file). If you want to make USTs from MIDI or original music, that's another reason to pick a proper DAW. OREMO is also fine (for just UTAU recording) if you want something quick.
- a reclist (you can use someone else's or create your own), as well as recording guides. It also helps to learn some linguistics (the difference between unaspirated and aspirated sounds, etc.), the alphabet, and perhaps the lexicography of the language(s) you want to record.
- a mic: laptop mics, Rock Band mics, karaoke mics, etc. are really low quality, but they're okay if you don't have a budget or are just getting started on your first bank (no one starts UTAU with proper recording equipment; everyone remembers their first laptop-mic recorded bank). Your first bank probably won't sound precisely the way you want it to; it will likely have a lot of noise, sound tinny, etc., but it's a start as you learn how to use the program.
- practice, practice, practice: practice using the program, configuring, using different bank types, recording different bank types, and recording your voice to get it to do what you want. Learn your voice's limitations and what it takes to record in a healthy way, so that you don't wear down your voice over time. Practice voice-acting and observe what you're capable of.
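
If you'd rather script the "creating silence in prerecorded samples" step than do it by hand in Audacity, here is a minimal sketch using only Python's built-in `wave` module. It is just one possible approach, not a standard UTAU workflow. It assumes uncompressed PCM WAV samples; the function name `pad_with_silence`, the 100 ms defaults, and the file names in the usage note are all made up for illustration. It writes a new file instead of overwriting the original, so the edit stays non-destructive.

```python
import wave


def pad_with_silence(src_path, dst_path, lead_ms=100, tail_ms=100):
    """Copy a PCM WAV file, adding digital silence before and after the audio.

    Hypothetical helper: the name and the default lengths are assumptions,
    not part of UTAU or any UTAU tool.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    # One frame holds one sample for every channel.
    frame_size = params.sampwidth * params.nchannels

    # 8-bit WAV samples are unsigned, so silence sits at 0x80.
    # Wider samples are signed, so silence is all zero bytes.
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"

    def silence(ms):
        n_frames = params.framerate * ms // 1000
        return silent_byte * (n_frames * frame_size)

    # Write to a separate file so the original recording is never modified.
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(silence(lead_ms) + frames + silence(tail_ms))
```

For example, `pad_with_silence("ka.wav", "ka_padded.wav", lead_ms=200)` would leave `ka.wav` untouched and write a copy with 200 ms of silence in front and 100 ms at the end.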