The words that I booted include adumbrate, opsimath, clerisy, deracinate, epigone, cenacle and cantle. Some of these I must have known at some time in my life but have now decluttered from my head, and all of them we can manage quite well without. Indeed I guess I am an opsimath without knowing it, so today I feel a bit like the chap in Molière who is surprised and delighted to find that he's been speaking 'prose' all his life. The difficult words, including those in the list above, are estimated to occur fewer than 3 times per million words.
My pal and contributor-of-comments Russianside mentioned a few days ago that it was the first time he'd written the word ooze. With The Wean having just turned 2, her parents were asking how on earth their nipper had learnt <such a word> (I forget the word that made them comment). They'd never used it in her presence, the child can't read yet and they couldn't imagine anyone in the creche using such a term. I remember a similar event when Dau.I was just able to walk: she pointed to the roof and said "chimney". Of all the words in the world to start cluttering her head with, that one wouldn't have occurred to me, and we adults were all certain that it hadn't come up in conversation. A key predictor of a large vocabulary is, predictably, that a child reads "a lot". The American/Brazilian team is looking for more test-takers, especially teenagers and others whose vocabulary is likely to be increasing rapidly, so ask your teen to give it a go. More data are good data.
I can't finish this discussion without mentioning Zipf's law, which says that "the frequency of any word is inversely proportional to its rank in the frequency table". Thus, the most common word in any language occurs twice as often as the second most common, three times as often as the third and four times as often as the fourth. Bizarre but empirically true. Zipf's Law applies in a wide range of other circumstances: the rankings of urban populations, income distribution, product sales in shops, YouTube or blog pageviews. Here are some data, showing that English text consists of a small number of very common words and a large number of uncommon ones. The latter is the Long Tail which is opening up the economics of publishing (Danny Battle can make some sales as well as Harry Potter, if enough readers [Amazon, we thank you] are given the choice). If you actually look at the words which make up the dataset below, you can have a punt at which long 19th century novel supplied the 200,000+ words.
The Blob has accumulated about the same number but on a wider diversity of topics, so it's not so surprising that adumbrate deracinates cantle.
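For anyone who wants to check Zipf's claim against a text of their own, here is a minimal Python sketch. It counts the words, ranks them by frequency, and sets each observed count beside the count Zipf's law would predict (the top word's count divided by the rank). The function names rank_frequency and zipf_prediction are my own invention, and the sample sentence is a contrived toy rather than real data; a proper test would need something the size of that 19th century novel.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most common word first."""
    # Crude tokeniser (an assumption): lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_prediction(top_count, rank):
    """Count that Zipf's law predicts for the word at a given rank."""
    return top_count / rank


# Toy sample, hand-made for illustration only.
sample = "the cat and the dog and the bird saw the fox"
table = rank_frequency(sample)
top_count = table[0][2]

for rank, word, count in table[:3]:
    predicted = round(zipf_prediction(top_count, rank), 2)
    print(rank, word, count, predicted)
```

Paste in any long text in place of the sample and compare the last two columns: the closer they track each other down the ranks, the more Zipfian the text.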