Armed with a new list of words and the power of social media, a new study published in Frontiers in Psychology has found that by the age of twenty, a native English-speaking American knows 42 thousand dictionary words.
“Our research got a huge push when a television station in the Netherlands asked us to organize a nationwide study on vocabulary knowledge,” says Professor Marc Brysbaert of Ghent University in Belgium, who led the study. “The test we developed was featured on TV and, in the first weekend, over 300 thousand Dutch speakers had done it – it really went viral.”
Realizing how interested people are in finding out their vocabulary size, the team then made similar tests in English and Spanish. The English test has now been taken by almost one million people. It takes up to four minutes to complete and has been shared widely on Facebook and Twitter, giving the team access to an unprecedented amount of data.
“At the Centre of Reading Research we are investigating what determines the ease with which words are recognized,” explained Professor Brysbaert. The test draws on a list of 62,000 words that he and his team have compiled.
He added: “As we made the list ourselves and have not used a commercially available dictionary list with copyright restrictions, it can be made available to everyone, and all researchers can access it.”
The test is simple. You are asked whether the letter sequence on the screen is an existing English word. Each test contains 70 words and 30 letter sequences that look like words but are not.
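The composition of a single test session can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual software: the function names `build_test` and `tally` are hypothetical, and the article does not say how responses are scored, so the sketch merely counts how often a participant answers "yes" to real words and to non-words.

```python
import random

def build_test(words, pseudowords, n_words=70, n_pseudo=30, seed=None):
    """Draw 70 real words and 30 non-words, then shuffle them together.

    `words` and `pseudowords` are placeholder lists; the real test draws
    its words from the team's 62,000-word list.
    """
    rng = random.Random(seed)
    items = [(w, True) for w in rng.sample(words, n_words)]
    items += [(p, False) for p in rng.sample(pseudowords, n_pseudo)]
    rng.shuffle(items)
    return items

def tally(items, answers):
    """Return the share of 'yes' answers for real words and for non-words.

    `answers` holds one bool per item: True means "yes, this is a word".
    This is a plain count, not the study's scoring rule (an assumption).
    """
    word_yes = sum(1 for (_, is_word), a in zip(items, answers) if is_word and a)
    pseudo_yes = sum(1 for (_, is_word), a in zip(items, answers) if not is_word and a)
    n_words = sum(1 for _, is_word in items if is_word)
    n_pseudo = len(items) - n_words
    return word_yes / n_words, pseudo_yes / n_pseudo
```

Including non-words gives a check on guessing: a participant who says "yes" to everything would accept all 30 of them as well.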
The test also asks for some personal information, such as your age, gender, education level and native language. This has enabled the team to discover that the average twenty-year-old native English-speaking American knows 42 thousand dictionary words. As we get older, we learn roughly one new word every two days, which means that by the age of 60 we know an additional 6,000 words.
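The learning rate quoted here is a rounded figure, which a quick calculation makes visible. The sketch below uses only the numbers in the article (42 thousand words at 20, an additional 6,000 by 60); the variable names and the 365.25-day year are assumptions made for illustration.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

words_at_20 = 42_000        # from the article
extra_words_by_60 = 6_000   # from the article
years = 60 - 20

# Rate implied by 6,000 extra words over 40 years.
words_per_year = extra_words_by_60 / years       # 150.0
days_per_word = DAYS_PER_YEAR / words_per_year   # a little under 2.5 days

# Vocabulary size at 60 implied by the article's figures.
words_at_60 = words_at_20 + extra_words_by_60    # 48000

# What exactly one word every two days would add over the same period.
gain_at_exact_rate = years * DAYS_PER_YEAR / 2   # 7305.0

print(f"{words_per_year:.0f} words per year, one every {days_per_word:.1f} days")
print(f"Vocabulary at 60: {words_at_60}")
print(f"Gain at exactly one word per two days: {gain_at_exact_rate:.0f}")
```

So 6,000 words over 40 years works out to about 150 words a year, or one word every two and a half days or so, slightly slower than the headline rate of one every two days.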
“As a researcher, I am most interested in what this data can tell us about word prevalence, i.e. how well each word is known in a language,” added Professor Brysbaert.
“In Dutch, we have seen that this explains a lot about word processing times. People respond much faster to words known by all people than to words known by 95% of the population, even if the words are used with the same frequency. We are convinced that word prevalence will become an important variable in word recognition research.”
With data from about 200 thousand people who speak English as a second language, the team can also start to look at how well these people know certain words, which could have implications for language education.
This is the largest study of its kind ever attempted. Professor Brysbaert has plans to improve the accuracy of the test and extend the list to include over 75,000 words.
“This work is part of the big data movement in research, where big datasets are collected to be mined,” he concluded.
“It also gives us a snapshot of English word knowledge at the beginning of the 21st century. I can imagine future language researchers will be interested in this database to see how English has evolved over 100 years, 1000 years and maybe even longer.”