Well, when recording an English bank, you wouldn't want to just record single words like "monkey" or "phenomenon", would you? Otherwise you'd be recording for a very long time. The same goes for Japanese, or any language. You can't just record full-blown words or we'd never get anywhere. So kanji is out the window, because for the most part kanji stand for whole words or pieces of meaning rather than individual sounds.
UTAU vocals (and I'm sure those of other vocal synthesis programs) are recorded in syllables: ah, ee, oo, eh, mE, sA, etc. You could record them separately or in a string, but they're just syllables nonetheless, the sounds that we use to make full words in any language. You record the syllables so that in the program you can move them around and put them together however you need to form the words you want.
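To make the "put the syllables together" idea concrete, here's a rough Python sketch that breaks a romaji word into syllables a bank has recorded. This is only an illustration, not how UTAU itself works (in UTAU you type the lyric for each note yourself). The `RECORDED` set and the `split_into_syllables` helper are names I made up, and the greedy longest-match approach is a simplification that can trip over ambiguous spellings.

```python
# Hypothetical set of syllables recorded in a bank (a real bank has many more).
RECORDED = {"a", "i", "u", "e", "o", "ka", "ku", "sa", "ra", "me", "n"}


def split_into_syllables(word, recorded=RECORDED):
    """Split a romaji word into recorded syllables, longest match first."""
    syllables = []
    pos = 0
    max_len = max(len(s) for s in recorded)
    while pos < len(word):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(max_len, len(word) - pos), 0, -1):
            chunk = word[pos:pos + size]
            if chunk in recorded:
                syllables.append(chunk)
                pos += size
                break
        else:
            # No recorded syllable fits here, so the word can't be built.
            raise ValueError(f"no recorded syllable matches {word[pos:]!r}")
    return syllables


print(split_into_syllables("sakura"))
```

With this made-up set, "sakura" comes apart as sa, ku, ra, so three small recordings cover a word nobody had to record whole.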
And hiragana is just how the Japanese write their syllables. It's one of the basic scripts of their writing system. They then combine it with kanji and katakana to make full words, sentences, and meanings. They don't use romaji as often as we do because, honestly, why should they? It's not needed.
In theory, you could actually alias a bank in katakana. In practice, though, it's not necessary because most Japanese people just don't go about writing their sentences in katakana. It's used mostly for loan words or emphasis, as has been said.
Which actually does bring me to aliasing. Oftentimes we overseas users record our Japanese banks in romaji and then alias the sounds in hiragana for the best of both worlds. That way the bank works with both romaji and hiragana lyrics, and most users can use it without a problem.
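If you'd rather not type every alias by hand, something like this Python sketch could generate the lines for you. It assumes the usual oto.ini entry layout of `file.wav=alias,offset,consonant,cutoff,preutterance,overlap` and that two entries can point at the same wav with different aliases. The `ROMAJI_TO_HIRAGANA` table and both function names are my own, the table is only a tiny sample, and the timing values are left at zero as placeholders you'd still have to set yourself.

```python
# Hypothetical romaji-to-hiragana table; a real bank would list every recorded syllable.
ROMAJI_TO_HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "sa": "さ", "me": "め",
}


def alias_line(wav_name, alias, timing=(0, 0, 0, 0, 0)):
    """Build one entry: file=alias,offset,consonant,cutoff,preutterance,overlap."""
    return f"{wav_name}={alias}," + ",".join(str(v) for v in timing)


def build_entries(romaji_names):
    """Return a romaji entry plus a hiragana alias entry for each recording."""
    entries = []
    for name in romaji_names:
        wav = f"{name}.wav"
        entries.append(alias_line(wav, name))       # romaji entry
        kana = ROMAJI_TO_HIRAGANA.get(name)
        if kana is not None:
            entries.append(alias_line(wav, kana))   # hiragana alias, same file
    return entries


for line in build_entries(["a", "ka", "sa"]):
    print(line)
```

Each recording ends up with one romaji line and one hiragana line pointing at the same file, which is the "best of both worlds" setup. Anything missing from the table just gets the romaji line. Saving the result in the encoding your UTAU install expects is a separate step this sketch doesn't cover.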