I know OP specifically stated VCCV in a later post, but I'll post a more general answer here:
The length of an English recording list depends on the list: some have only a few hundred sounds and some have many thousands.
Generally, the more a list covers within a given vowel constraint, the easier it is to actually use.
For example, a list with 1000 sounds and only 7 vowels would potentially be easier to use than a list with 14 vowels and only 1000 sounds, since the same number of recordings is spread over half as many vowels.
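To put rough numbers on that comparison, here's a quick sketch. "Sounds per vowel" is just a crude average I'm using for illustration (the function name is made up), not an established reclist metric, and real lists won't spread their sounds evenly.

```python
def sounds_per_vowel(total_sounds: int, vowel_count: int) -> float:
    """Average number of recorded sounds per vowel (a rough density proxy)."""
    return total_sounds / vowel_count


# The two lists from the example above: same size, different vowel counts.
compact = sounds_per_vowel(1000, 7)
broad = sounds_per_vowel(1000, 14)

print(f"7 vowels:  about {compact:.0f} sounds per vowel")
print(f"14 vowels: about {broad:.0f} sounds per vowel")
```

So at the same list size, the 7-vowel list has roughly twice as many sounds backing each vowel.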
That being said, it also depends on how robust the list is. Many lists follow a "completionist approach" where they just make as many combinations as possible without proofreading. This means the list could have all the needed combinations, but at the expense of many junk recordings that may never be used.
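Here's a toy sketch of how that kind of list balloons. The phoneme sets and the "needed" set below are placeholders I made up for illustration. They aren't taken from any real reclist, and actual lists use far more phonemes and patterns.

```python
from itertools import product

# Placeholder phoneme sets, NOT from any real recording list.
consonants = ["k", "s", "t", "n"]
vowels = ["a", "i", "u"]

# "Completionist" generation: every consonant-vowel-consonant string,
# with no check on whether the combination is ever actually needed.
all_cvc = ["".join(p) for p in product(consonants, vowels, consonants)]

# Hypothetical result of proofreading: the combinations someone decided to keep.
needed = {"kat", "sin", "tun", "nak"}

# Everything generated but not kept is a candidate junk recording.
junk = [s for s in all_cvc if s not in needed]

print(len(all_cvc), "generated,", len(needed), "needed,", len(junk), "possibly junk")
```

Even with only 4 consonants and 3 vowels, that's 48 strings for a single pattern, so with a full English phoneme set the unproofread count climbs fast.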
With all that in mind, an English recording list to start with will be around 1000 sounds minimum for a somewhat complete-ish selection, leaving usability aside as a factor.