Word Classification Test Histogram and Standard Deviation


 

 

     My baseline scores were 194 on the Word Classification Test and 51.4 on the 600-word vocabulary test. I can be reached by e-mail: Robert N. Seitz <rnseitz@netzero.net>.
     However, I wasn't content to stop there. No, indeedy.
     As a physicist, I’ve never been a word buff. I never worked crossword puzzles or looked up words in the dictionary, and for years my wife and I didn’t own a dictionary. I’d been taught that vocabulary has the highest correlation (0.8) with IQ of any mental measure. Since IQ can’t be significantly enhanced, I had assumed that vocabulary couldn’t be significantly enhanced, either. But then it occurred to me that I’ve learned a lot of new words since I was a teenager. Most people probably encounter words that challenge them fairly often, and many try to expand their vocabularies. I decided to see what would happen if I tried it, too. I got the idea of trying to learn all the words in the Webster's Collegiate Dictionary. Granted, a dictionary might not offer the best selection of useful words. You might, perchance, have shuffled off this mortal coil and never reckoned yourself the poorer for having failed to know that a “basilium” is a “basiliomycetum cell” or that “emmet” is an archaic word for “ant”. And dictionaries tend to be heavily weighted with botanical and zoological terms. However, it was the best list I could find at the time.
    It took about four weeks to type 10,000 single or multiple definitions into the computer, plus another week to add comments and to check the material for errors. (Some of the 10,000 definitions were of words that I knew, but felt that I didn't know sufficiently well.) How long the memorization itself took is hard to say, since it was largely done in stolen moments over a period of several months. During that time, I added some words to my vocabulary. So how well did it work? Well, folks, the jury is still out. I'm finding that I'm forgetting some of these words if I don't look at them periodically. Mnemonics work best, e.g., "The carcajou chased the kinkajou up the acajou." or "I harled a henequen hawser into the hawse." The question is, will virtually all of these words click into place after a few more bimestrial reviews? If not, it would seem to me to support the contention that IQ and vocabulary are closely correlated. If so, it might call that contention into question. Had I been a logophile, I might have expanded my vocabulary over the years, but probably not by 10,000 words.



[8-1-2000 Update: I have just gone back to review these words after, perhaps, a year away from them, and have found that, although I knew most of them thoroughly a year and a half ago, I've forgotten many of them (although they come back very quickly). On the other hand, I just turned 71, so that may be playing a role in my retention limitations. However, I have permanently learned several thousand of them, and if I review them periodically, I may eventually remember them all. Also, learning by re-reading these lists may not be the best way to engrail these words in stone. Some words that are similar, like babasu, babesia, and babirusa, can be difficult to keep straight.]


     I have set up a web page with this 10,100-word Thesaurus/Dictionary available on it for downloading. Words in the Thesaurus are hyperlinked to words in the Dictionary, so that you can check on their exact definitions after finding them in the Thesaurus. The Thesaurus/Dictionary exists in two versions: a Word 97 format and an HTML format. The hyperlinks work in the Word 97 version, but they don't work in the HTML rendition. In the meantime, I'll be glad to email copies of this Thesaurus/Dictionary to anyone who wants them.

Extreme Non-Linearity of the Two Tests in Their Upper Registers
      Both tests seem to become very non-linear at their upper ends. Although my dictionaries fail to list the number of entries they contain, an entry count for the Webster's Collegiate and Random House abridged dictionaries indicates that both dictionaries contain about 68,000 entries. That means that my initial score of 194 on the Word Classification Test and 51.4 on the 600-Word Vocabulary Test would correspond to a total vocabulary of about 58,000 dictionary-entries. Expanding my vocabulary by another 10,000 words to all 68,000 dictionary-entries would only have raised those numbers to 198 on the Word Classification Test and to 57.8 on the 600-Word Vocabulary Test (assuming that "quodrat" should really be "quadrat"). The remaining words on both tests are not to be found in an abridged dictionary. Adding another 15,000 words to my internal lexicon from the Random House Unabridged Dictionary would have afforded me a perfect score of 200 on the Word Classification Test and a 58.9 on the 600-Word Test. Only by annexing another 7,500 words culled from my huge, 1930's Merriam-Webster's Second International Unabridged Dictionary could I have gotten all the words on both tests (except for the word "filemot", which is found in, and perhaps only in, the 20-volume Oxford English Dictionary).
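The flattening described above can be made concrete with a little arithmetic. The sketch below takes only the figures quoted in the preceding paragraph and works out how many score points each additional 1,000 words would buy. The cumulative total of 83,000 is simply 68,000 + 15,000, and the names `MILESTONES` and `marginal_gains` are made up for this illustration.

```python
# Figures quoted in the text: (cumulative vocabulary in dictionary entries,
# Word Classification Test score, 600-Word Vocabulary Test score).
MILESTONES = [
    (58_000, 194, 51.4),  # baseline
    (68_000, 198, 57.8),  # + 10,000 words: the whole abridged dictionary
    (83_000, 200, 58.9),  # + 15,000 words from the Random House Unabridged
]


def marginal_gains(milestones):
    """Return (WCT, 600-word) points gained per 1,000 added words for each step."""
    gains = []
    for (v0, wct0, voc0), (v1, wct1, voc1) in zip(milestones, milestones[1:]):
        thousands = (v1 - v0) / 1000
        gains.append(((wct1 - wct0) / thousands, (voc1 - voc0) / thousands))
    return gains


for step, (wct_gain, voc_gain) in enumerate(marginal_gains(MILESTONES), start=1):
    print(f"step {step}: WCT {wct_gain:.2f} pts per 1,000 words, "
          f"600-word test {voc_gain:.2f} pts per 1,000 words")
```

On these figures, the payoff per 1,000 words drops sharply between the first step and the second on both tests, which is the non-linearity in question.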
      This experiment suggests some interesting implications. Preliminary estimates indicate that I would only have to learn about 22,500 additional words (15,000 + 7,500) to know all the relevant words in the Webster's International Unabridged Dictionary. (By "relevant words", I mean words other than additional names of rocks, birds, animals, and botanical terms. I figure I've learned enough of them already in the American Heritage Abridged Dictionary.) Only then would I be able to get a perfect score (59.9) on the 600-word vocabulary test (assuming that "sandiver" is the correct spelling of "sandever"). Twenty-two thousand five hundred is a lot of extra words to learn, but knowing them all would give you a virtually total command of the English language.
      By contrast, Shakespeare used only 29,066 different words (including proper names) in all his works, of which about 18,000 are root words or lemmas. (However, his vocabulary may have been more exotic than it sounds. There may be many common words like "basin", "wallet", or "tatting" that Shakespeare wouldn't have used.)
      I am publishing these "10,100 most-difficult words in an abridged dictionary" on my web page for easy download. These words and their definitions require less than 2 megabytes of disk space and don't take long to download. They are available as a Word 97 file or in HTML format.

Scaling of Total Vocabulary with IQ
      One of the interesting implications of these musings is that the average man/woman-in-the-street would probably know 25,000 to 35,000 dictionary entries. (Note that the dictionary includes additional words like "here", "go", "schoolteacher", and "Africa" that wouldn't be found on a vocabulary test.)
      Opening an abridged dictionary to a set of randomly pre-selected pages and testing subjects with IQs in the neighborhood of 100 might establish a baseline measure of absolute vocabulary. This is a test that cries out to be conducted, because it would be important to know how various capabilities, including vocabulary, scale with IQ. It would be very easy to assemble a cohort of individuals, open the dictionary to the selected pages, and count the number of words whose definitions they know. (Unlike a vocabulary test, the dictionary has all the words.) I've speculated that total vocabulary might be roughly proportional to the ratio IQ.
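The page-sampling test proposed above amounts to a simple ratio estimate: the fraction of sampled entries a subject knows, scaled up to the whole dictionary. Here is a minimal sketch of that calculation. The function name `estimate_vocabulary` and the page tallies are hypothetical, invented purely for illustration, and the 68,000-entry total is the abridged-dictionary count mentioned earlier.

```python
import math


def estimate_vocabulary(page_results, total_entries):
    """Estimate how many dictionary entries a subject knows.

    page_results: list of (entries_on_page, entries_known) tuples.
    total_entries: total number of entries in the dictionary.
    Returns (estimated_known_entries, rough_standard_error).
    """
    sampled = sum(n for n, _ in page_results)
    known = sum(k for _, k in page_results)
    p = known / sampled  # fraction of sampled entries the subject knows
    # Simple binomial standard error. It treats entries as independent,
    # which ignores any clustering of hard words on particular pages.
    se = math.sqrt(p * (1 - p) / sampled)
    return p * total_entries, se * total_entries


# Hypothetical tallies for five randomly chosen pages (not real data).
pages = [(40, 18), (35, 15), (42, 20), (38, 17), (45, 20)]
estimate, error = estimate_vocabulary(pages, total_entries=68_000)
print(f"about {estimate:,.0f} entries, give or take {error:,.0f}")
```

With only a few pages the uncertainty is large, so a real test would want many more pages per subject than this toy sample uses.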
    I have prepared a Word Classification Test Histogram depicting the Word Classification Test scoring frequencies that fall within each five-point interval. I have also estimated a value for the standard-deviation-on-the-right, using the top 100 scores, plus 30 of the older scores below the current top 100. This has necessarily required assuming a value for the mean. Although scores on this test might well be skewed to the right, scores expressed as standard deviations are relatively insensitive to variations in the assumed value for the mean. The reason is that as the assumed mean score decreases, the standard deviation increases, tending to stabilize the sigma-displacement values. Accordingly, I have assumed a mean equal to the median (now = 170). Table I shows these calculations and the resulting value for the standard deviation (~11).
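To see why the sigma-displacement values hold steady, here is a minimal sketch of one plausible reading of a "standard deviation on the right": the root-mean-square deviation of the scores at or above the assumed mean. That definition is an assumption, not necessarily the calculation behind Table I, and the score list is hypothetical, invented only to show the effect.

```python
import math


def right_sigma(scores, assumed_mean):
    """RMS deviation from the assumed mean, using only scores at or above it."""
    upper = [s for s in scores if s >= assumed_mean]
    return math.sqrt(sum((s - assumed_mean) ** 2 for s in upper) / len(upper))


def sigma_displacement(score, scores, assumed_mean):
    """How many right-hand standard deviations a score lies above the assumed mean."""
    return (score - assumed_mean) / right_sigma(scores, assumed_mean)


# Hypothetical scores (not the actual test data).
scores = [150, 155, 160, 165, 168, 170, 172, 175, 178, 182, 186, 190, 195, 200]

for mean in (160, 165, 170):
    sigma = right_sigma(scores, mean)
    z = sigma_displacement(200, scores, mean)
    print(f"assumed mean {mean}: sigma = {sigma:.1f}, a score of 200 sits {z:.2f} sigma above")
```

For this made-up sample, lowering the assumed mean from 170 to 160 raises the right-hand sigma, while the displacement of a score of 200 changes by only a few hundredths of a sigma.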
    My personal guesstimate regarding the meanings of these scores would be that Fredrik Berchtold's perfect score of 200 on the Word Classification Test would fall above the 1 in 100,000 level, and considering the non-linearity of the test in relation to its last few questions, could lie above the 1 in 1,000,000 threshold. There is reason to believe that a score of 165 on the Word Classification Test corresponds to a vocabulary at, or somewhat above, the 1 in 1,000 level.