English is a complex language, and UTAU voicebanks approach it in many different ways. One of the most apparent differences is the notation used for sounds, which determines how you need to type lyrics into the notes.
A few methods are particularly common. One of them is VCCV, and you can watch the tutorial for using voicebanks and editing USTs here. The phoneme system was custom-made by the creator of VCCV and isn't based on any existing standard. You will need to check the creator's other videos to learn how it works.
Another is ARPAsing, which is based on ARPABET, a phoneme system widely used in English speech and singing technology. You may also see the same system in other singing synthesizers, like CeVIO and Emvoice One.
Here's how to use an ARPAsing voicebank with the Assistant plugin:
https://arpasing.neocities.org/en/resources/usage-w-assistant.html
And how to use a voicebank without the plugin:
https://arpasing.neocities.org/en/resources/usage-wout-assistant.html
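To give a rough idea of what ARPABET notation looks like, here is a small Python sketch. The three transcriptions are written in the style of the CMU Pronouncing Dictionary, which marks stress with digits. The `SAMPLE_DICT` table and the `to_plain_arpabet` helper are made up for this illustration, and the assumption that a voicebank wants lowercase phonemes without stress digits may not hold for every bank, so check the guides above for the exact lyric format.

```python
import re

# A few hand-copied, CMUdict-style ARPABET transcriptions.
# The digits after vowels mark stress (0 = none, 1 = primary).
SAMPLE_DICT = {
    "hello": "HH AH0 L OW1",
    "sing": "S IH1 NG",
    "cat": "K AE1 T",
}


def to_plain_arpabet(word):
    """Return the word's phonemes in lowercase with stress digits removed.

    Assumption: the voicebank expects lowercase, stress-free phonemes.
    """
    phonemes = SAMPLE_DICT[word.lower()].split()
    return [re.sub(r"\d", "", p).lower() for p in phonemes]


print(to_plain_arpabet("hello"))  # ['hh', 'ah', 'l', 'ow']
```

This only shows how a word breaks down into phonemes. How those phonemes are spread across notes depends on the voicebank and on whether you use the plugin.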
You may also sometimes see voicebanks recorded with Delta's set of English reclists. These are encoded in X-SAMPA, an ASCII notation that corresponds directly to the International Phonetic Alphabet. Vocaloid also uses X-SAMPA, though its transcriptions are slightly different.
Here's a guide on how to use Delta English voicebanks:
https://ch.nicovideo.jp/delta_kimigatame/blomaga/ar764836
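To illustrate how X-SAMPA lines up with the International Phonetic Alphabet, here is a small Python sketch that converts a few standard X-SAMPA symbols to IPA. The table covers only a handful of symbols common in English, and the `XSAMPA_TO_IPA` table and `xsampa_to_ipa` helper are made up for this example. It follows general X-SAMPA, not Delta's reclist specifically, so the symbols a given voicebank actually uses may differ.

```python
# A partial table of standard X-SAMPA symbols and their IPA equivalents.
# Symbols that are the same in both systems (k, t, s, ...) are left out.
XSAMPA_TO_IPA = {
    "{": "æ", "@": "ə", "I": "ɪ", "U": "ʊ", "V": "ʌ",
    "E": "ɛ", "A": "ɑ", "O": "ɔ",
    "S": "ʃ", "Z": "ʒ", "T": "θ", "D": "ð", "N": "ŋ",
    "tS": "tʃ", "dZ": "dʒ",
}


def xsampa_to_ipa(symbols):
    """Convert space-separated X-SAMPA symbols to an IPA string.

    Symbols missing from the table are passed through unchanged.
    """
    return "".join(XSAMPA_TO_IPA.get(s, s) for s in symbols.split())


print(xsampa_to_ipa("k { t"))  # kæt
print(xsampa_to_ipa("T I N"))  # θɪŋ
```

The input is split on spaces, so each phoneme has to be typed as a separate token. That keeps two-character symbols like `tS` unambiguous.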
For other English methods, you will have to check whether the reclist creator has written an official guide. However, as you gain experience with English voicebanks, you'll generally be able to figure out how to use any voicebank you encounter.
That said, if I had to recommend one specific voicebank for you to try out, I'd invite you to download KYE.