Science matters: Hypnopompic is a very long word

Friday, 10 January 2014

Hypnopompic is a very long word

Hypnowha'? is the most likely response to the title of this post. Peculiarly attentive readers of The Blob may just ask "where have I seen that word before?". Me, I was delirah because in December I wrote the word for the first time in my life on The Blob and then came across it as part of a Test Your English Vocabulary quiz. According to the quiz, I have an estimated vocabulary of more than 40,000 words. This puts me in the 90th percentile of the people who take the test. Of course this isn't a random selection of people but rather a selection from the intersection between a) people who do quizzes on the internet and b) people who are interested in (their) language skills. Draw a Venn Diagram! I found myself doing the quiz because Dau.II, who went back to Cork last week (sniff, gulp), was asking me how big a person's vocabulary is. I claimed that the average vocabulary of a native-speaking adult is about 10,000 words and that medical students double that in the course of their training (digitalis, oxalic, catheter, trochanter, oculomotor, radius etc). I've been misinformed: your average 8 year old has a vocabulary of 10,000 words. With Dau.II the thought is the deed and she located and took the quiz even as I was holding forth a few days ago. Not being an avid reader like her sister Dau.I, or old like me, she didn't make it to 40,000 words but was still above the median.

The words that I booted include adumbrate, opsimath, clerisy, deracinate, epigone, cenacle and cantle. Some of these I must have known at some time in my life but have now decluttered from my head, and all of them we can manage quite well without. Indeed I guess I am an opsimath, without knowing it, so today I feel a bit like the chap in Molière who is surprised and delighted to find that he's been speaking 'prose' all his life. The difficult words, including those in the list above, are estimated to occur fewer than 3 times in 1 million words.

My pal and contributor-of-comments Russianside mentioned a few days ago that it was the first time he'd written the word ooze. With The Wean having just turned 2, her parents were asking how on earth their nipper had learnt <such a word> (I forget the word that made them comment). They'd never used it in her presence, the child can't read yet and they couldn't imagine anyone in the creche using such a term. I remember a similar event when Dau.I was just able to walk: she pointed to the roof and said "chimney". Of all the words in the world to start cluttering her head with, that wouldn't have occurred to me, and we adults were all certain that it hadn't come up in conversation. A key predictor of a large vocabulary is, predictably, that a child reads "a lot". The American/Brazilian team is looking for more test-takers, especially teenagers and others whose vocabulary is likely to be increasing rapidly, so ask your teen to give it a go. More data are good data.

I can't finish this discussion without mentioning Zipf's law, which says that "the frequency of any word is inversely proportional to its rank in the frequency table". Thus, the most common word in any language occurs twice as often as the second most common, three times as often as the third, four times as often as the fourth. Bizarre but empirically true. Zipf's Law applies in a wide range of other circumstances: the rankings of urban populations, income distribution, product sales in shops, YouTube or blog pageviews. Here's some data, showing that English text consists of a small number of very common words and a large number of uncommon ones - the latter is the Long Tail which is opening up the economics of publishing (Danny Battle can make some sales as well as Harry Potter, if enough readers [Amazon, we thank you] are given the choice). If you actually look at the words which make up the dataset below, you can have a punt about which long 19th century novel supplied the 200,000+ words.

The Blob has accumulated about the same number but on a wider diversity of topics, so it's not so surprising that adumbrate deracinates cantle.
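
For anyone who wants to try Zipf's law on a text of their own, here is a minimal Python sketch. It is my own illustration, not the method behind the dataset mentioned above: the function names `rank_frequency` and `zipf_expected` and the toy sentence are all invented for the example. It counts the words, ranks them from most to least common, and sets each count beside the "top count divided by rank" figure that the law would predict.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (rank, word, count) tuples, most common word first."""
    # Crude tokeniser (an assumption): lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common()
    return [(rank, word, n) for rank, (word, n) in enumerate(counts, start=1)]


def zipf_expected(top_count, rank):
    """Count that Zipf's law predicts at a rank, given the count at rank 1."""
    return top_count / rank


if __name__ == "__main__":
    # Toy sentence, far too short to show the law properly.
    sample = "the cat and the dog and the bird saw the cat"
    table = rank_frequency(sample)
    top_count = table[0][2]
    for rank, word, n in table:
        print(rank, word, n, round(zipf_expected(top_count, rank), 2))
```

A sentence this short will not follow the law at all closely; the pattern only starts to emerge with tens of thousands of words, so paste in a whole novel from a plain-text file to see it.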

3 comments:

Russianside, 10 January 2014 at 08:40
Thanks to a wonderful pressie over Christmas titled 1,339 QI Facts my own vocabulary is growing, for example Omphalodium = belly button (or tummy button to quote the author) and then two other related nuggets - 1, Michelangelo was called a heretic for giving Adam a bellybutton on the Sistine Chapel painting, and 2, just for the scientist, the human bellybutton contains 67 different species of bacteria... who knew