An "ACT2" is simply an upgrade in quality, for when you feel your voice bank is... lacking something, I guess.
So, if you recorded with a built-in mic, but someone just got you a shiny new Yeti, that might call for an "ACT2" (or version 2, or 2.0, or whatever you want to call it!).
Or perhaps you had a good mic for your first voice bank, but you missed a lot of sounds and mispronounced everything, and it simply isn't the tone you wanted. That could also call for an upgrade.
As for an "append", it's simply an extra bank of sound files meant to help your UTAUloid sing other genres. So! Say you want your voice bank to sing Toeto, but their vocal quality is really powerful, so they're pretty much screaming and can't sing a song that soft. You might make a whisper, cute, or small voice bank for them by recording the samples in a softer voice.
That's about it, I think.