Are IQ Distributions Ln-Normal Rather Than Gaussian?
IQ Distributions Aren't Gaussian Distributions!
I'd like to offer profound thanks to Steve Coy and Patrick
Wahl for correcting what I've said about non-Gaussian distributions failing to
possess standard deviations. I'm delighted to get letters
correcting or questioning what I've said on this website. Please feel free to
comment.
Discussions of IQ distributions usually show a bell-shaped curve and
explain that IQs are distributed in accordance with a Gaussian or normal
distribution. In fact, they're not! As Dr. Hans Eysenck observes
in his book, "Genius", pp. 38-39:
"Ever since Quetelet fell in love with it, and used it to
define the 'average man' (Stigler, 1986), the Gaussian curve has exerted a
fatal influence on social scientists. As Gabriel Lipman told Poincaré, à
propos the curve: 'les experimentateurs s'imaginent que c'est un théorème de
mathématique, et les mathématiciens d'être un fait experimental'.
('Experimentalists think that it is a mathematical theorem, while
mathematicians believe it to be an experimental effect.') Galton's arrangement
may serve well enough as a picture of the distribution of intelligence, but
certainly not of eminence.
"(Even as far as intelligence is concerned, Burt (1963) has suggested
that a Pearson Type IV curve fits the case better than the normal curve!)
Micceri (1993) has analysed a large sample of curves of distribution that
would normally be expected to be 'normal', finding practically all of them
to deviate markedly from statistical normality. As Geary (1947) said long
ago: '. . . normality is a myth; there never was, and never will be a normal
distribution' (p. 241)."
Steve Coy writes,
"It is a well known fact (unfortunately I don't have a reference for you) in
real-world applied statistics that you virtually never encounter any quantity
with anything very closely approaching a normal distribution *in the wings*. The
wings are virtually always significantly higher than what the normal
distribution predicts, more and more so the further you go out. This is because
(a) the normal distribution is in a sense 'as compact as it can get',
for a given standard deviation. At the opposite end of the spectrum, the
Chebyshev inequality tells us the maximum fraction of a distribution that can
lie beyond n standard deviations from the mean. I'm virtually certain that good
studies have been done of this, but I'll bet that if you took a random sample of
published real world measure data *of any kind*, no matter how normal-ish the
distributions seem at first glance, if you look closely at the tails, they won't
be."
The distribution of heights is non-Gaussian. There are
many more extremely tall and extremely short men than would be predicted by a Gaussian curve.
And as for IQ:
For openers, Gaussian distributions run from -∞ to +∞.
IQs, on the other hand, start at 0, peak at 100, and extend to
infinity, falling off more slowly than a Gaussian distribution
as one moves out along the right-hand wing of the IQ curve.
Lognormal distributions start at 0, peak near 1, and extend to
infinity, falling off more slowly than a Gaussian distribution
as one moves out along the right-hand wing of the curve.
For example, if we used the Gaussian curve as our guide, we would expect
to find an IQ of 160 or above among about
one in 31,600 children. In fact, we find an IQ of 160 or above occurring
in about one in every 1,125 kids. (Please see Table 1, below.)
If we used the Gaussian curve as
our guide, we would expect to find an IQ of 200 or above among one in 78,000,000,000
children. In practice, we seem to find an IQ of 200 or above in one in about
every 500,000 kids. And as for Marilyn vos Savant's childhood IQ of 228,
don't even ask! (The odds are smaller than one in a quadrillion when you hit
218!)
These are no minor discrepancies. A Gaussian curve underpredicts the
likelihood of finding someone with an IQ of 200 by the order of about 150,000
to 1!
Obviously, Gaussian distributions don't work well for IQs that are very
far away from the population average.
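The comparison above can be roughly reproduced with a few lines of Python. This is only a sketch under stated assumptions: the Gaussian figures are computed with a standard deviation of 15 (the value that appears to reproduce the "one in 31,600" figure), and the ln-normal figures assume that the natural logarithm of IQ/100 is Gaussian with a standard deviation of 0.15. The function names are hypothetical.

```python
import math

def upper_tail(z):
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def gaussian_one_in(iq, sd=15.0):
    """Rarity (1 in N) of an IQ at or above `iq` if IQs were Gaussian.
    sd=15 is an assumption, not a value stated at this point in the text."""
    return 1.0 / upper_tail((iq - 100.0) / sd)

def lognormal_one_in(iq, s=0.15):
    """Rarity (1 in N) at or above `iq` if ln(IQ/100) is Gaussian with SD s.
    s=0.15 is an assumed round value."""
    return 1.0 / upper_tail(math.log(iq / 100.0) / s)

for iq in (160, 200):
    print(f"IQ {iq}+: Gaussian 1 in {gaussian_one_in(iq):,.0f}, "
          f"ln-normal 1 in {lognormal_one_in(iq):,.0f}")
```

Under these assumptions the printed rarities should land close to the figures quoted above, though not digit for digit, since the exact parameters behind the quoted numbers aren't given.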
This raises an interesting point. If most measured quantities
are non-Gaussian out on their wings, this could explain the deviations of an IQ
curve from a Gaussian profile. Further, if we fit the logarithms of IQs with a
Gaussian distribution, the distribution of the logarithms would presumably
not form a perfect Gaussian, either. On the other hand, there seems to be a
close fit between a ln-normal distribution and the IQ curves that I've
investigated.
What actually happens if you try to use a standard
deviation is that the standard deviation stretches as we move to the right along
the curve.
An IQ of 116:
For example, if you go out along the IQ curve to the point
where 84.135% of the population lies below it (one standard deviation on a
Gaussian curve), you'll find yourself at IQ =
116.2. The spread, in IQ points, between the population average of 100 and the
point on the curve where 84.135% of the population lies below it is thus 16.2
points of IQ. So if we try to apply a standard deviation to it, the first
"standard deviation" would be 16.2 points.
I'm wondering if this might not be why early researchers came
up with a standard deviation of 16 for ratio IQs. It would have allowed them to
fit their data in the neighborhood of the mean. Discrepancies farther out would
have been more difficult to prove in the early days because the number of people
getting higher scores falls off as the IQ rises.
An IQ of 135:
If you go farther along the IQ curve to the point where
97.725% of the population lies below it (two standard deviations on a Gaussian
curve, which this isn't), you'll find you're at an IQ of 134.986, or about 135.
The difference between 116.2 and 135.0 is 18.8 points of IQ, so we might say
that our second standard deviation is 18.8 points of IQ.
An IQ of 157:
If you continue along the curve to the IQ where 99.865% of
the population lies below it (three standard deviations on a Gaussian curve),
you'll be at an IQ of 156.831, or about 156.8. This is 21.8 points above 135, so
we might say that our third standard deviation is 21.8 points of IQ.
An IQ of 182:
If you wander farther down the curve to the celebrated
99.997% required for entry into the four-sigma societies, you'll find yourself
at an IQ of 182.2, and the equivalent of four standard deviations on a Gaussian
curve. And the difference between 182.2 and 156.8 is 25.4 IQ points, so we might
assign 25.4 points to our fourth (ersatz) "standard deviation".
An IQ of 212:
Far down the curve comes the 1-in-3,500,000 point at which
99.99997% lies below your lofty perch, at an IQ of 211.7. (You're in
rarefied company here.) Since 211.7 is 29.5 points above 182.2, we can say that
our fifth (bogus) standard deviation stretches over 29.5 points of IQ.
An IQ of 246:
Getting about as far along the IQ curve as mere mortals can
go, we arrive at the one-in-a-billion level of IQ = 246.0. This sixth fictitious
"standard deviation" would be about 34.3 points, or a little more than
twice the value of the standard deviation in the vicinity of the mean.
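The stretching "standard deviations" walked through above can be sketched in Python. The assumption here is that the IQ at k Gaussian standard deviations is 100 times e raised to the power 0.15 times k, which reproduces the cutoffs quoted above (116.2, 135.0, 156.8, and so on). The variable names are illustrative only.

```python
import math

S = 0.15  # assumed standard deviation of ln(IQ/100)

# IQ at 0, 1, 2, ... 6 "sigma" on the ln-normal curve
cutoffs = [100.0 * math.exp(S * k) for k in range(7)]

# Width, in IQ points, of each successive ersatz "standard deviation"
widths = [round(hi - lo, 1) for lo, hi in zip(cutoffs, cutoffs[1:])]

for k, (iq, width) in enumerate(zip(cutoffs[1:], widths), start=1):
    print(f"sigma {k}: IQ {iq:6.1f}, width {width:4.1f} points")
```

Each width is about 1.16 times the one before it, which is why the sixth one ends up a little more than twice the first.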
Would a Ln-Normal Curve Fit IQs Better Than a Gaussian Curve?
Two years ago, Dr. Robert Dick suggested to a University of Kentucky
math major, John Scoville, that he try plotting the natural logarithms of
the ratios of mental ages to chronological ages to see if that would generate
numbers that were closer to a Gaussian distribution. (The IQ is simply the ratio of the mental
age to the chronological age multiplied by 100.) For example,
for a ratio of mental age to chronological age of 1.4, corresponding to an
IQ of 140, the natural logarithm is 0.3365. If you multiply that natural
logarithm by 100 and add 100 to it, it looks awfully much like an IQ of 133.65.
And, in fact, it's what psychometrists are calling the deviation "IQ"
these days. The question was: how well would these logarithmic numbers fit a Gaussian
distribution?
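The conversion just described can be sketched as a small Python function. It assumes a log standard deviation of 0.15 and a deviation-IQ standard deviation of 15, in which case "multiply the logarithm by 100 and add 100" and "convert to a z-score, then rescale" come to the same thing. The function name is hypothetical.

```python
import math

def deviation_iq(ratio_iq, s=0.15, sd=15.0):
    """Convert a ratio IQ to a deviation IQ, assuming ln(ratio IQ / 100)
    is Gaussian with standard deviation s (0.15 is an assumed value)."""
    z = math.log(ratio_iq / 100.0) / s   # position in Gaussian sigmas
    return 100.0 + sd * z                # rescale to a mean-100 scale

print(round(deviation_iq(140), 2))       # the worked example in the text
```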
The short answer is, for all intents and purposes: perfectly! In 1951, Geoffrey Thomas Sare
submitted an M. A. thesis entitled, "The Complexity of Gestalt as a Factor
in Mental Testing: a Contribution to the Theory of Test Construction".
In his thesis, Mr. Sare included a table matching IQs with their observed
frequencies of occurrence: empirical data gathered over the years. When
John compared his lognormal calculations to Mr. Sare's data, he found a
virtually perfect match. For example, for an IQ of 200, a ln-normal distribution
predicts a rarity of 1 in 521,000 children. Mr. Sare's table shows a rarity
of 1 in 532,000. Compared to 1 in 78,000,000,000, that's pretty good! Even
if the ln-normal calculations differed by a factor of 2 from the observed
frequencies, that would still be enormously better than a Gaussian fit, and
would be within range of experimental uncertainties.
To read the frequency of occurrence that corresponds to a
given mental age/chronological age, one needs to take its natural logarithm and
then divide this by the standard deviation of the (Gaussian) distribution of
the natural logarithms of IQs. This standard deviation needs to be determined
experimentally. This is a very important step, because what we're seeking is a
natural psychological constant, akin to the Rydberg Constant or, perhaps, the
speed of sound under standard conditions.
Using the Sare data, Mr. Scoville arrived at a value for s
ranging from 0.1493 to 0.1501, corresponding to a standard deviation for the
deviation "IQ" of about 15 points of deviation "IQ". (David
Wechsler wrought well when he defined a standard deviation of 15 points for his
Wechsler Intelligence Scales.) You'll notice that I said the deviation
IQ because, if this ln-normal hypothesis is correct, there can be only one
distribution of deviation "IQs" for a large, unselected population (Gaussian),
and one standard deviation (15) for it. (This is not to say that we can't
readily construct a sample population with IQs whose natural logarithms are not
randomly distributed. The usual sampling precautions would have to be exercised
to ensure that a sample population were representative of the larger population
for which norms are to be established.)
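As a rough illustration of how s might be backed out of a single observed frequency, here is a Python sketch. It is not necessarily the procedure Mr. Scoville used: it simply takes one data point quoted above (a rarity of 1 in 532,000 at IQ 200), finds the Gaussian z-score with that upper-tail probability, and divides the natural logarithm of the ratio by it. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def implied_s(ratio_iq, one_in):
    """Log standard deviation implied by one observed rarity, assuming
    ln(IQ/100) is Gaussian and `one_in` is the rarity at or above ratio_iq."""
    # By symmetry, the z whose upper tail is 1/one_in is minus the
    # z whose lower tail is 1/one_in.
    z = -NormalDist().inv_cdf(1.0 / one_in)
    return math.log(ratio_iq / 100.0) / z

s = implied_s(200, 532_000)
print(f"implied s = {s:.4f}")
```

A real estimate would fit many IQ levels at once rather than a single point, so this should be read only as a consistency check on the quoted range.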
John Scoville published his results on the Internet in a paper entitled
"Statistical Distribution of Childhood IQ Scores". I was intrigued with his results.
I had wondered how anyone could explain childhood IQ's of 200, or Marilyn
vos Savant's childhood IQ of 228 when such scores were so absurd if IQ's
followed a Gaussian distribution. I first attempted to apply them to the
IQ distributions that Terman, et al, found in their screening of more than
250,000 California schoolchildren in 1921-1922.
What I found was agreement at and above an IQ of 165 among the Sare data, Mr. Scoville's ln-normal
distribution, and the Terman data.
Over the IQ range 150 to 164, Dr. Terman found about 50% of
the children with IQ's between 150 and 164 that a ln-normal distribution would
predict... but bear in mind that Dr. Terman only found about 30% of the number
of children with IQs of 140 or above that a ln-normal distribution would
predict, and only about 60% as many as a Gaussian normal distribution would
predict.
Over the IQ range from 140-149, Dr. Terman only included
about 22% of the children that a ln-normal distribution would predict, and only
about 44% as many children as he would have been expecting to find, based upon a
Gaussian distribution.
Table 1, below, shows how Dr. Terman's
distribution compared with the Gaussian distribution that he would have been
expecting. The second column shows what a Gaussian normal distribution would
have led him to expect, while the third column shows what he actually
found. Look at the differences! It's obvious at a glance that his numbers don't agree remotely with
the Gaussian numbers he would have been anticipating. Furthermore, the number in
the 140-144 range should have been at least twice that of the 150-154 range.
Instead, it's only about 12% larger. (You
wonder what went through the heads of Lewis Terman and his staff when they
realized that their results were differing so dramatically from what they must
have expected... like, "What's the scientific community going to say about
this strange distribution?" and "How are our sponsors going to greet
these wild anomalies?")
The fourth column shows the numbers that a Gaussian
distribution would have predicted if the total number of cases were restricted
to the 621 subjects found in Dr. Terman's Main Group. It's included primarily to show the shape of an equivalent Gaussian
distribution compared to the Terman distribution.
There have been critiques of the Terman results, but so far I
haven't seen anything that mentions the arithmetic anomalies that you're seeing
here.
Table 1. Frequencies of IQs Found by the Terman Study Screening, vs. Frequencies Predicted by a Gaussian Distribution

IQ Range    Gaussian Prediction*    Terman Found    Alt. Gaussian Prediction**
140-144     637                     160             377
145-149     264                     150             156
150-154     100                     134             59
155-159     34                      64              20
160-164     11                      43              6
165-169     3                       27              2
170-174     0.77                    20              0.46
175-179     0.18                    8               0.11
180-184     0.04                    10              0.02
185-189     0.0085                  2               0.005
190-194     0.00125                 2               0.00074
195-199     0.00022                 0               0.00013
200         0.00003                 1               0.00002
TOTALS:     1052                    621             621

*  This column shows the distribution of IQs that Dr. Terman must have expected to find,
based upon a presumed Gaussian distribution of IQs in his subject population.
(I have used a standard deviation of 16 in calculating the Gaussian predictions because
that's what I think Dr. Terman would have employed.)
** This column shows a Gaussian distribution with the same number of subjects
as Dr. Terman's "Main Group", the intent being to compare the shape of a
Gaussian curve with the shape of Dr. Terman's distribution.
In the IQ range from 140 to 144, the Terman screening found
only about 1/4th as many children as a Gaussian distribution would have
predicted, and this is in the IQ range where the agreement with a Gaussian
prediction should have been closest!
In the IQ range from 145 to 149, the Terman screening found
only about 57% as many children as a Gaussian distribution would have
predicted.
In the IQ range from 150 to 154, the Terman screening found
about 134% as many children as a Gaussian distribution would have predicted. By
now, the Terman population has switched from fewer children in each IQ category
than a Gaussian distribution would have predicted, to more and more children in
each IQ range than a Gaussian distribution would have predicted, with a
crossover in the upper 140's.
Above this crossover IQ, the frequencies predicted by a Gaussian
distribution fall off enormously more rapidly than the frequencies observed in
the Terman screening.
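The percentages quoted in the last few paragraphs come straight from the first three rows of Table 1, and can be checked with a short Python sketch. The counts are copied from the table; the dictionary names are illustrative only.

```python
# Counts copied from the first three rows of Table 1
gaussian_predicted = {"140-144": 637, "145-149": 264, "150-154": 100}
terman_found = {"140-144": 160, "145-149": 150, "150-154": 134}

# Terman's count as a percentage of the Gaussian prediction, per IQ band
percent_of_gaussian = {
    band: round(100 * terman_found[band] / gaussian_predicted[band])
    for band in gaussian_predicted
}

for band, pct in percent_of_gaussian.items():
    print(f"IQ {band}: {pct}% of the Gaussian prediction")
```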
One of the explanations that has been given for these
anomalously high frequencies of high childhood IQs is that a few children
develop mentally earlier than average, with some of them going through mental
growth spurts, and that these children might account for the phenomenally high
scores found among a few children. The mental growth rates of these children
would later slow down, while others (e.g., late bloomers) would partially
catch up with them.
This seems plausible. Physical growth spurts occur among
children, and some children reach puberty earlier than others. The only
dissonance here is the
Observation #2: About the Terman Study
Because of resource limitations, Dr. Terman's identification of his
"termites" was a seat-of-the-pants operation. Fishing mostly in the cities
around Stanford University, Dr. Terman found about 1 child in 262 with an
IQ of 140 or above in his "main group", where he would have expected (assuming
the distribution of IQs to be Gaussian or "normal") to find about 1 child
in every 161 (using his standard deviation of 16) with an IQ of 140 or above.
However, as we know today, about 1 child in every 80 has a ratio IQ of 140 or above.
Out of an estimated 168,000 schoolchildren, Dr. Terman should have found about
2,100 children with IQs above 140. Instead, he found only 621, or about 30%
of the expected number. In addition, there were the following shortcuts taken in
the screening and IQ certification process.
(1) He relied on teachers to identify potential candidates for his 643-child
"main group". This led to a preselection of 6% to 8% of the total school
population in the cities of the San Francisco Bay Area.
(2) This first step was followed by the administration of the National Intelligence
Test (a printed group test).
(3) Those children who scored at the 95th percentile or above on the National
Intelligence Test were then given an abbreviated form of the
1916 Stanford-Binet test created for this purpose.
(4) The IQs of the children who scored highest and/or were the oldest were
corrected by Dr. Terman in an attempt to sidestep ceiling effects.
In addition to these artifacts of the selection process, there's also
the possibility that Dr. Terman's "Termites", drawn as they were from Berkeley,
San Francisco and the Bay Area, may have been a somewhat enriched sample.
Thirty-five percent of the children's parents were listed as "professionals".
Taken all-in-all, there were generous opportunities for errors to creep
in, and Dr. Terman's data suggests that error did creep in.
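The head counts quoted at the start of this section (1 child in 161, 1 child in 80, about 2,100 expected, about 30% found) can be checked with a short Python sketch. It assumes a Gaussian standard deviation of 16 for the first figure and a ln-normal log standard deviation of 0.15 for the second; the 0.15 is an assumed round value, and the variable names are illustrative only.

```python
import math

def upper_tail(z):
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

SCHOOLCHILDREN = 168_000   # estimated population screened (from the text)
FOUND = 621                # children with IQ 140+ actually found (from the text)

# Rarity of IQ 140+ under each model
gaussian_one_in = 1.0 / upper_tail((140 - 100) / 16.0)
lognormal_one_in = 1.0 / upper_tail(math.log(1.40) / 0.15)  # s = 0.15 assumed

expected = SCHOOLCHILDREN / 80          # using the rounded "1 in 80" figure
fraction_found = FOUND / expected

print(f"Gaussian: 1 in {gaussian_one_in:.0f}; ln-normal: 1 in {lognormal_one_in:.0f}")
print(f"expected {expected:.0f}, found {FOUND} ({fraction_found:.0%})")
```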
The table below compares the distribution of IQs in Dr. Terman's "main
group" with the distribution of IQs that the Sare tables would predict.

Ratio IQ Range    Wechsler "IQ" Range    Gaussian Prediction*    Sare Prediction    Terman Found    Terman as % of Sare
140-144           133-136                637                     1,000              160             16%
145-149           137-140                264                     525                150             29%
150-154           141-143                100                     281                134             48%
155-159           144-146                34                      151                64              42%
160-164           147-149                11                      82                 43              52%
165-169           150-152                3                       34                 27              79%
170-174           153-155                0.77                    21                 20              95%
175-179           156-158                0.18                    8                  8               100%
180-184           159-161                0.04                    4                  10              250%
185-189           162-164                0.0085                  2                  2               100%
190-194           164-166                0.00125                 1                  2               200%
195-199           167-169                0.00022                 0.664              0               0%
200+              169+                   0.00003                 0.336              1               300%

*  This column shows the distribution of IQs that Dr. Terman must have expected to find,
based upon a presumed Gaussian distribution of IQs in his subject population.
I have used a standard deviation of 16 in calculating the Gaussian predictions because
that's what I think Dr. Terman would have employed.
In the IQ 140-144 range, he only enrolled
about 16% of the candidates that should have been picked up by the screening.
In the IQ 145-149 range, he enrolled about 29% of the candidates that
should have been picked up by the screening.
In the IQ 150-154 range, he enrolled about 48% of the potential candidates.
In the IQ 155-159 range, he enrolled only about 42% of the available
children.
In the IQ 160-164 range, he got about 52% of what should have been out
there...
In the IQ 165-169 range, he enrolled about 79% of the potential candidates.
In the IQ 170-174 range, he enrolled about 95% of the potential candidates.
Above IQ 175, he got them all.
So he got only 25% of the available children in the 140-154 range,
about 50% of the children in the IQ 155-164 range, and
virtually 100% of the children above IQ 164.
The IQ range from 140 to
154 had more than 70% of the Termites in it. But . . .
Only 25% of the children with IQ's below 155 who could have been in
the study were included in it, but virtually 100% of the children in the 165-and-up
range were admitted to the study.
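The band-by-band enrollment figures can be recomputed from the table with a short Python sketch. The two lists are copied from the table's Sare-prediction and Terman columns, in 5-point bands starting at IQ 140; the helper name `enrolled_share` is hypothetical.

```python
# One entry per 5-point band: 140-144, 145-149, ..., 195-199, 200+
sare_predicted = [1000, 525, 281, 151, 82, 34, 21, 8, 4, 2, 1, 0.664, 0.336]
terman_found = [160, 150, 134, 64, 43, 27, 20, 8, 10, 2, 2, 0, 1]

def enrolled_share(first, last):
    """Terman's count as a fraction of the Sare prediction for
    bands first..last-1 (list indices, not IQ values)."""
    return sum(terman_found[first:last]) / sum(sare_predicted[first:last])

print(f"IQ 140-154: {enrolled_share(0, 3):.0%}")
print(f"IQ 155-164: {enrolled_share(3, 5):.0%}")
print(f"IQ 165 up:  {enrolled_share(5, 13):.0%}")
print(f"Share of all Termites in IQ 140-154: "
      f"{sum(terman_found[:3]) / sum(terman_found):.0%}")
```

The middle band comes out a little under the "about 50%" quoted above, which is a rounded figure.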
Unwittingly or otherwise, Dr. Terman stacked the deck.
I fantasize that, faced with funding and staffing limitations, he cut
back the number of children in the lower IQ registers. (However, it could also
be that, in the teachers' selection, the brightest children stood out like red
beacons, whereas the brilliant, but not most-brilliant, children weren't as
obvious.) His distribution seriously
underrepresents the numbers of children in the two bottom echelons of his
study. (That may be how he missed the two Nobel Prize winners who were among
the children he screened and rejected: Dr. William Shockley and Dr. Luis
Alvarez.)
He came up with a few more children (43 versus 37) in the IQ 170 to 200 range than
the Sare data would predict, but that may have been because
of statistical fluctuations, or because of his "correcting" (by as much as
14 points) high scores for ceiling effects, or because his sample included
the University of California at Berkeley and the San Francisco Bay cities.
What is crucial about these results is the close fit at the upper
end of the IQ spectrum. This is the territory in which a Gaussian distribution
goes completely bananas when it comes to fitting the observed IQ frequencies.
#       1st word       S/A    2nd word
2.      decadence             decline
7.      if                    although
11.     abjure                renounce
24.     peculation            embezzlement
33.     insouciant            nonchalant
50.     choleric              apathetic
58.     truculent             violent
65.     cenobite              anchorite
71.     ambiguous             equivocal
80.     devisor               assignor
86.     diatribe              invective
89.     viscosity             viscidity
98.     encomium              eulogy
103.    sophistry             casuistry
116.    abstruse              recondite
Ten sample analogy questions from the CMT

#      Statement of the Analogy                        Choices (circle one)
7.     A : C :: X :                                    1. Y  2. V  3. Z
9.     Darwin : Evolution :: Einstein :                1. Relativity  2. Mathematics  3. Magnetism
14.    Square : Cube :: Circle :                       1. Sphere  2. Line  3. Round
22.    July 4 : United States :: July 14 :             1. England  2. Spain  3. France
38.    Mercury : Venus :: Earth :                      1. Mars  2. Jupiter  3. Saturn
44.    Socrates : Plato :: Samuel Johnson :            1. Swift  2. Pope  3. Boswell
47.    Analysis : Synthesis :: Differentiation :       1. Integration  2. Frustration  3. Abomination
58.    Astrology : Astronomy :: Alchemy :              1. Physics  2. Chemistry  3. Phrenology
63.    Gascon : France :: Walloon :                    1. Netherlands  2. Transvaal  3. Belgium
70.    Eight : Two :: Thousand :                       1. Twenty-five  2. Twenty  3. Ten
This test was validated using 81 Wilson
College freshmen-and-women (I won't say fresh women... grin), 135 Stanford
sophomores, and 96 Stanford seniors. The mean score made by the 954 Termites who
took this test in 1940 was 96 out of 190.
Quinn McNemar analyzed the data and concluded that the mean for the CMT if
given to the general population (in 1940) would have been about 2 correct
answers, with a standard deviation of 45. That means that a raw score of 92
would have corresponded to a 2-sigma level among the general population.
Consequently, the Termites' average score of 96 would represent an IQ of about
2.1 sigma above the mean, or using their standard deviation of 16, it would have
corresponded to a deviation IQ of about 134. This is to be contrasted with their
childhood average ratio IQ of 152. However, since that time, others have
analyzed the results and have arrived at an average IQ 2.5 sigma above the mean,
and this seems to be the currently accepted value. This corresponds to a
deviation IQ of 140 on
a sigma = 16 scale, or about 146 on a ratio IQ scale... 6 points below their
childhood average. Since the Termites' average score was 96, and the average
for the general population was 2, this corresponds to a standard deviation of
94/2.5 or 37.6.
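The arithmetic in this paragraph can be laid out as a short Python sketch. The raw-score figures are those quoted above; the ratio-IQ conversion assumes the ln-normal relation with a log standard deviation of 0.15, which is an assumed round value. Variable names are illustrative only.

```python
import math

POP_MEAN = 2.0        # McNemar's estimated general-population mean raw score
TERMITE_MEAN = 96.0   # Termites' mean raw score in 1940

# McNemar's version: raw-score standard deviation of 45
z_mcnemar = (TERMITE_MEAN - POP_MEAN) / 45.0
iq_mcnemar = 100.0 + 16.0 * z_mcnemar        # deviation IQ on a sigma = 16 scale

# Later version: Termites taken to be 2.5 sigma above the mean
z_revised = 2.5
implied_sd = (TERMITE_MEAN - POP_MEAN) / z_revised    # raw-score SD
deviation_iq = 100.0 + 16.0 * z_revised
ratio_iq = 100.0 * math.exp(0.15 * z_revised)         # s = 0.15 assumed

print(f"McNemar: z = {z_mcnemar:.2f}, deviation IQ about {iq_mcnemar:.1f}")
print(f"Revised: SD = {implied_sd:.1f}, deviation IQ = {deviation_iq:.0f}, "
      f"ratio IQ about {ratio_iq:.1f}")
```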
Two of the male Termites on the CMT-A tied for the high score of 172. I'm
going to assume
(1) that the Terman screening netted all of the brightest children, and
(2) that the Termites who took the CMT-A included the adult editions of the very
brightest children.
Since there were 260,000 children in the original population,
I would expect to find, by definition, one child with a deviation IQ
(standard deviation = 16) of 171 (one in 220,000), or two children with deviation IQs of
169 (one in 120,000). And
I would expect to find those same children among the group who took the CMT-A in
1940. I would expect to find in a population of 260,000, by definition(!),
roughly one adult with a deviation IQ of 171, or roughly two adults with
deviation IQs of 169. (That's true because deviation IQs are defined solely in
terms of their frequencies of occurrence.) So that means that I can equate a raw
score of 170 on the CMT-A to a z-score of 4.3725, corresponding to a
standard deviation of 170/4.3725 = 38.9. This is 1.3 points of raw score
different from the 37.6-point standard deviation that we obtained above for the
Termite mean score. However, the Termites' mean of about 2.5 standard deviations
from the mean is only approximate. If I use 38.9 for the standard deviation, I
get a mean IQ for the Termites of 2.474 sigma, or a deviation IQ of about 139.6.
This would also set the ceiling of the CMT-A at approximately
4.83 standard deviations above the population mean, or about 177.
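The rarities and the ceiling estimate above can be checked with a short Python sketch. It assumes a Gaussian deviation-IQ scale with a standard deviation of 16, and it assumes the ceiling figure is measured from the general-population mean of 2 on a 190-point test. Function and variable names are hypothetical.

```python
import math

def upper_tail(z):
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def one_in(dev_iq, sd=16.0):
    """Rarity (1 in N) of a deviation IQ at or above dev_iq."""
    return 1.0 / upper_tail((dev_iq - 100.0) / sd)

POPULATION = 260_000
for iq in (171, 169):
    print(f"IQ {iq}: 1 in {one_in(iq):,.0f}, "
          f"about {POPULATION / one_in(iq):.1f} expected in {POPULATION:,}")

sd_raw = 170.0 / 4.3725                  # raw-score standard deviation
ceiling_z = (190 - 2) / 38.9             # assumes mean of 2, maximum score of 190
ceiling_iq = 100.0 + 16.0 * ceiling_z
print(f"raw SD = {sd_raw:.1f}, ceiling = {ceiling_z:.2f} sigma, IQ about {ceiling_iq:.0f}")
```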
It's worth noting that this value for the standard deviation
is derived independently of the methodology used to estimate the mean and
standard deviation for the average IQ of the Termites, and that it agrees quite
closely with the results obtained by Terman, et al, for the mean IQ of the
Termites. Also,
It might be worth noting that this test wasn't well-validated, at least in
1940.


(To be continued)