I'm going to be 100% honest with you: this voicebank actually sounds pretty good. I don't think the issue is your voice or any vocal damage; it seems to come down to the resampler choice. I'd recommend using Moresampler, Fresamp or Doppelter. I know Kiyoteru recommended tn_fnds, but I wouldn't, since it basically turns into the meme of Peter Griffin glitching out and getting sick after eating a rice cake whenever it has to handle anything longer than an eighth note.
Also, make sure you set the modulation (mod) to 0 in any USTs. It helps a TON with a VB that sounds off pitch, and fixes it about 99% of the time.
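If you have a lot of USTs and don't want to fix each one by hand, here's a rough Python sketch of the idea. It assumes the UST stores modulation as a per-note `Modulation=` line (some older files spell it `Moduration=`, so it checks for both), and `zero_modulation` is just a name I made up for this. It only rewrites values that are already there and works on the file's text, so you'd still need to read and write the file yourself with whatever encoding your UST uses (often Shift-JIS, but check yours). Back up the UST first.

```python
import re

# Matches a per-note modulation line. Assumption: the UST uses the key
# "Modulation" or the older spelling "Moduration". The value is matched up to
# (not including) the line ending, so CRLF/LF endings are left untouched.
_MOD_LINE = re.compile(r"^(Modulation|Moduration)=[^\r\n]*", re.MULTILINE)


def zero_modulation(ust_text: str) -> str:
    """Return the UST text with every existing modulation value set to 0."""
    return _MOD_LINE.sub(lambda m: f"{m.group(1)}=0", ust_text)


if __name__ == "__main__":
    # Tiny made-up UST fragment, just to show the effect.
    sample = (
        "[#0000]\r\nLength=480\r\nLyric=a\r\nNoteNum=60\r\nModulation=30\r\n"
        "[#0001]\r\nLength=480\r\nLyric=i\r\nNoteNum=62\r\nModuration=15\r\n"
    )
    print(zero_modulation(sample))
```

Everything else in the file (lyrics, lengths, pitch data) is left as it was.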
The default resampler has a bit of a habit of making voicebanks sound kinda screechy and metallic, regardless of the vocal texture of the voicebank it's trying to render. Grainier voices do tend to work fairly well in UTAU so long as you're not using the default resampler. Trei, Arachne, Ken Shippai, S.A.M. Intel, and Kikyuune Aiko are all voices with a bit of husk/graininess to them. They're all regarded as being pretty good quality, and their tone is in a similar ballpark to what you've linked. If you're wondering why they have that depth of sound in covers while your UTAU doesn't, it likely just comes down to mixing. All UTAUs sound a little flat and dull in the program until you start the mixing process. JOEZCafe has a pretty good mixing tutorial specifically about vocal synths.