The engine hasn't really changed much but tbh it's more of an issue with the voices rather than the engine. Sure the engine could probably be improved upon but are they gonna?
I doubt it.
I'm gonna digress a bit...
For Vocaloid to be successful, they'll need to actually deliver something.
For a "professional" product it sure has a lot of bugs and an absolute lack of good voices.
I mean, none of the English voices sound good from a non-vocutau user's perspective. Poor tone, unwanted accent, weak/brittle, nasal, slurred, or a combination of these.
Not to mention the poor standards on the voices. I mean you can actually hear the noise when you use certain voices.
That would have been entirely unacceptable when recording a real vocalist for mixing.
From the looks of it, Vocaloid is an easy ride for Yamaha?
They license it out for companies to use, with probably little to no effort on their part beyond that. They could just let Vocaloid rot without real updates, and as long as it still sells, voices are still being produced, and they get a good profit margin, they'll beat the dead horse 'til its ribs cave in.
All they'll need to do is mess with the UI and release the same technology over and over again and people will still buy it.
I get that Vocaloid is mature and much of it is set in place, but I'd hate to see it reach a sort of "development death" where it is left to fester and only gets compatibility updates maybe once every several years, like a lot of software does.
I think something cheaper, like Utau (that's right, Utau isn't free. You should think about supporting it), actually does pose a threat to Vocaloid, as the quality is there. People just aren't using it in a way that sounds natural and is convenient for an end user.
imo no public method of making voicebanks gives very natural results, and they all aim for an exaggerated speech tone while being more abrasive than a TTS voice.
tl;dr
So the engine could use some tweaks maybe but tbh that won't help if the voices are still trash.
And cheaper stuff poses a threat, so that's always a possibility as well. But it all sucks anyways