Saturday, June 22, 2013

Intelligence and the Flynn Effect, One More Time

[Edit 02/11/2017: The final version of this essay can be found here.]

Imagine an experiment that proceeds in the following fashion. There is a stage, and onto that stage researchers place boxes that are fairly nondescript other than that each has a small hole near the bottom. Each box is left on the stage for precisely an hour, during which time a small quantity of water drains from the hole and is collected and measured.

After taking a broad sampling of boxes it is discovered that there is some variation in the results — some boxes produce more water, some less. The distribution comes out nearly normal, with a mean around 500 ml collected in an hour and a standard deviation around 100 ml. To provide some visualization for these results, the researchers place just three boxes on the stage: a left box producing 400 ml in an hour, one standard deviation below the mean; a middle box producing 500 ml in an hour, right at the mean; and a right box producing 600 ml in an hour, one standard deviation above the mean.

The study of these boxes is of some importance to the researchers because the boxes serve useful purposes in the community. For instance, among other valuable features, food placed on top of a box does not spoil as quickly as it would otherwise. Furthermore, other experiments have shown that a box's usefulness tends to be directly proportional to the box's water production score, and although this correlation is not exact it is statistically significant and it emerges in all kinds of usefulness experiments. Based upon this strong correlation, the researchers define a box's usefulness as equivalent to its water production ability.

Curious about the variation in scores and wanting to learn more about the cause of water production ability, the researchers conduct experiments that focus on the physical/generational characteristics of each box. Some of these characteristics, such as surface material and factory of production, emerge as helpful candidates, because variations in these characteristics correlate with variations in water production scores. Again, the correlation is not exact but it is statistically significant and it allows the researchers to predict water production scores with a reasonable degree of accuracy given any box's overall characteristics. These results are strong enough to convince the researchers that a box's water production score and its physical/generational characteristics are tightly linked. The researchers begin to form theories about the nature of this linkage.

These experiments are frequently repeated, at least once each year, and the researchers note that the results remain extremely consistent: same variation, same distribution, same correlations. All the prior years' findings are regularly verified; no theory gets overturned.

And yet over time, there does arise one nagging problem.

Despite the fact that almost every feature of the experiment remains exactly the same (same setup, same variation in results, same distribution, same usefulness, same correlations, same everything), the water production scores keep going up. They keep going up every year, and they keep going up for all the boxes. The amount of increase is not overwhelming but it is large enough that it cannot be ignored. For instance, the collection containers that were used in the early years of the experiment eventually have to be replaced with bigger containers to prevent spillage. After ten years of these experiments, when the researchers place the three representative boxes onto the stage to help visualize the results, the left box now produces 480 ml of water in an hour, the middle box produces 600 ml, and the right box produces 720 ml. The researchers realize that an average box now has the same water production ability as did a box one standard deviation above the mean from just ten years prior.

Many explanations for this phenomenon are offered and investigated, beginning with a focus on the boxes themselves. Has there been a change in the surface materials? The researchers discover that for a small number of boxes some slight alterations have indeed been introduced. Has there been a change in the construction process? Again the researchers find that one factory has been mothballed and another remodeled, although the majority of production facilities remain as before. The researchers look for still other clues, such as changes in size or weight, and although particular instances can be found, such changes are not pervasive. Indeed, that becomes the telltale defect against all these explanations — each accounts for only a small number of cases at best, and appears inadequate in the face of the water production increase across boxes and across time.

The researchers then focus on the boxes' environment, thinking that this line of attack might uncover a more wide-ranging solution. For instance, a few stages where experiments are conducted now sit higher than they used to. Other stages are made out of metal whereas they were formerly made out of wood. And some stages have had a ventilation system installed. But here too, such circumstances account for only a limited number of cases, and furthermore, environmental explanations suffer from a still more troubling defect, namely that no one can explain how an environmental change would translate into the necessary change in physical/generational characteristics for each box. Everyone agrees — as a consequence of the experiments conducted each year — that water production scores and the physical/generational characteristics of each box are tightly linked, so that any change in water production scores corresponds to a change in the boxes' characteristics. But how does an environmental change produce such an effect? How does a higher stage, a metal stage or a ventilated stage produce the requisite change in surface material or point of origin? It seems implausible.

One researcher attempts to solve this dilemma by demonstrating how an environmental change and a physical/generational change can feed off each other with an amplifying effect. He uses terms such as multiplier and feedback loop and produces an impressive array of mathematics. “For instance a small change in surface material or production quality can generate a subtle difference in air flow around the box,” he begins. “At the point of maximum air flow differential, a powerful feedback vibration is set up inside the box. Then the vibration amplifies the rearrangement of surface material, which impacts the air flow still more the following year, which causes ….”

Another researcher, noting the large number of hypotheses generated so far and each one's inability to account for all the cases, suggests that perhaps there is not just one explanation for the rising water production scores but that the solution is to be found in a combination of explanations. This approach seems promising to the researchers, although they have to agree it is not the kind of definitive answer they originally had in mind.

Then one day a visitor shows up and mentions something to the researchers. “I don't know about you, but it seems to me that it keeps getting warmer all the time. I've been coming to your experiments for several years now, and each time I visit, it feels hotter than it did the last time. Then it occurred to me that this would make for an elegant explanation for your rising water production scores.”

“How so?” ask the researchers.

“A general rise in temperature is a perfect match to what you're looking for. It has all the essential features. For one, the increase in temperature mirrors the increase in water production scores. Also, the increase in temperature is continuous over time, just like the rise of water production scores. And finally, the increase in temperature is ubiquitous: it affects all the boxes the same.”

The researchers are not convinced.

“Here are the shortcomings in your explanation,” they say to the visitor. “In the first place, your explanation isn't germane to the problem. We are investigating water production ability, as measured by water production scores and contained in the physical/generational characteristics of these boxes. It's hard to see how ambient temperature can even be relevant to that discussion. But assuming it were somehow relevant, your explanation has an even bigger defect: you can't provide a plausible description for how a change in environmental temperature will alter the physical/generational characteristics of each box. Are you trying to suggest that a slight increase in temperature will somehow rearrange a box's surface materials in a profound way, or somehow reset a box's factory of origin? That would be ridiculous. If a change in temperature were the cause of a change in water production scores, then that change in temperature must also impact the physical/generational characteristics of each box, because those characteristics are the source of a box's water production ability.”

The visitor thinks about this for a while, then gives a lengthy reply:

“I believe you're working under a mistaken assumption. Listen, I agree with you that variations in a box's physical/generational characteristics produce corresponding changes in a box's water production score. You have plenty of experimental evidence for that, and the results are strong and compelling. But the results are so strong and compelling that they seem to have convinced you that the inference is valid also in the other direction, that is, that every change in water production score is necessarily accompanied by a change in a box's physical/generational characteristics. But actually, you have no evidence that the inference is valid in that direction; all your evidence runs the other way. Moreover, the increase in overall water production scores across time and across all boxes suggests quite strongly that such an inference would be wrong.

“Here's how I would describe the situation. We have three different quantities in play: a box's capacity to produce water, the water production score, and the ambient temperature. Let's let letters stand for these quantities:

C = a box's CAPACITY to produce water,
S = a box's water production SCORE, and
T = the ambient TEMPERATURE.

“A box's water production score (S) is the combination of the two other factors (C and T) working independently. This can be expressed in a simple relationship:

S = C x T.

“That relationship fits the experimental results to a tee. At any given point in time, the ambient temperature (T) will be constant, so when experiments are run at that time, all the variation in S will be the direct result of the differing physical/generational characteristics of each box, because those characteristics drive a box's capacity to produce water (C).

“But over time it's the orthogonal effect that holds sway. Over time, it is C that remains constant. Your investigations have already told you this, because when you went looking for physical/generational changes across time that would explain your results, you discovered that physical/generational changes across time are minimal at best, hardly worth the notice. But if C remains essentially constant, then all the increase in S over time is explained solely by an increase in T. The rising ambient temperature causes water production scores to increase over time and does so without impacting any box's physical/generational characteristics.”

“But it has to impact those characteristics,” the researchers insist. “A box's physical/generational characteristics embody its water production ability.”

“No, that's just it,” the visitor answers. “Those physical/generational characteristics tell only half the story at best. If you want to fully understand the nature of water production ability, you have to take into full consideration the ambient temperature too. And if it's the increase in water production ability you're trying to explain, then all the focus has to go towards the ambient temperature, because that's the only factor that changes over time.”

The researchers are polite but end by saying they cannot discuss the matter any further — they have combinations of explanations and impressive mathematical formulas to attend to.



I believe the above analogy forms a nearly exact isomorphism to the current situation regarding intelligence and the Flynn effect.

IQ scores are like water production scores, and individual people (or individual brains, if you will) are like boxes. Scientists have built up a large, compelling cache of evidence that variations in IQ scores among individuals are driven by a presumed set of neuronal/genetic characteristics, the idea being that if we had good knowledge of an individual's neuronal/genetic background, we could predict with a reasonable degree of accuracy that individual's IQ score and corresponding likelihood of success within the community. The correlations are not exact but they are strong enough to be informative across both individuals and groups of people.

If that is all there were to it, then intelligence would be essentially explained. However, that is not all there is to it. In addition to the experimental evidence outlined so far, scientists have also discovered that raw IQ scores keep going up over time, a phenomenon that has been dubbed the Flynn effect. The scores keep going up every year and they keep going up across all sorts of populations.

Lots of explanations have been offered. Many of them focus on presumed improvements to a human's neuronal/genetic underpinnings, through better nutrition for instance, or assortative mating. Other explanations focus on changes in environmental features, such as increasing visual stimulation or a wider access to education. None of these explanations, however, can account for the widespread impact of the Flynn effect. The environmental explanations also suffer from the further perceived difficulty that they must be translated into more or less permanent changes in a person's neuronal/genetic characteristics, because everyone agrees those characteristics are what drive intelligence. This translation often seems implausible.

Dickens and Flynn (2001) have suggested the problem can be solved by demonstrating that neuronal/genetic characteristics and environmental factors resonate off each other with amplifying effect. They have developed multipliers, feedback loops and complex mathematical formulas that show how this mechanism can be tuned to experimental results. Other researchers, perhaps frustrated over the lack of a definitive solution, have suggested that the Flynn effect can be explained by no one factor alone, but that instead a combination of factors must be involved.

My contribution to the Flynn effect discussion is simple — I seem to have noticed something that apparently no one else has. I have noticed that the amount of non-biological pattern, structure and form contained within the human environment keeps going up over time. This increase is apparent from a survey of human history, from the human great leap forward through the agricultural revolution through the ancient civilizations of Mesopotamia, Egypt and Greece through the Renaissance and finally culminating in the explosion of technologies and constructions we live within today. When humans move through their environment, they navigate an increasingly complex maze of pattern, structure and form, and they navigate this maze at a faster and faster pace. Measuring the quantity of this maze would be a challenge, but however one would measure the amount of non-biological pattern, structure and form tangibly contained within the human environment, one would have to conclude it keeps going up year after year after year.

This increasing amount of pattern, structure and form contained within the human environment makes for an ideal and elegant explanation of the Flynn effect. It has all the essential features. For one, the increase mirrors the increase in raw IQ scores. Also, the environmental increase is continuous over time, just as with the rise in test scores. Plus the environmental increase is ubiquitous: people are exposed to it literally everywhere. Furthermore, in considering the content of an IQ exam (all those questions formed out of pattern, structure and form), one recognizes that navigating an IQ exam is not all that unlike navigating the maze of the surrounding world, and thus it cannot be too surprising that ambient pattern, structure and form must have something to do with human intelligence.

Here is how I would describe the situation. There are three different quantities in play: a person's neuronal/genetic capacity for intelligence, the raw IQ score, and the amount of pattern, structure and form contained within the human environment. We can let letters stand for these quantities:

C = a person's neuronal/genetic CAPACITY for intelligence,
S = the raw IQ SCORE, and
A = the AMOUNT of pattern, structure and form contained within the human environment.

A person's IQ score (S) is the combination of the two other factors (C and A) working independently. This can be expressed in a simple relationship:

S = C x A.

That relationship fits the experimental results to a tee. At any given point in time, the ambient pattern, structure and form (A) will be constant, so when experiments are run at that time, all the variation in S will be the direct result of the differing neuronal/genetic characteristics of each person, because those characteristics drive a person's capacity to demonstrate intelligence (C).

But over time the orthogonal effect holds sway. Over time, it is C that remains constant; we have little in the way of evidence to suggest that profound neuronal/genetic changes occur over time, just as would be expected under the tenets of biology and evolution. But if C remains essentially constant, then all the increase in S over time is explained solely by an increase in A. The rising amount of pattern, structure and form contained within the human environment causes IQ scores to increase over time and does so without impacting any person's neuronal/genetic characteristics.
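The arithmetic behind that relationship can be made concrete with a minimal sketch. It reuses the box numbers from the analogy (400, 500 and 600 ml, rising to 480, 600 and 720 ml ten years on), since no comparable figures for C and A are given here. The capacity values, the ambient factors of 1.0 and 1.2, and the names `capacity`, `scores`, `then` and `now` are illustrative assumptions chosen to reproduce those figures, not measured quantities.

```python
# Illustrative sketch of "score = capacity x ambient factor".
# All numbers are assumptions taken from the box analogy, not measurements.

# C: each box's fixed capacity (assumed constant across the ten years).
capacity = {"left": 400.0, "middle": 500.0, "right": 600.0}


def scores(capacity, ambient):
    """Return the score S = C * ambient for every box."""
    return {name: c * ambient for name, c in capacity.items()}


# The ambient factor (T for boxes, A for people) is the only thing that changes.
then = scores(capacity, ambient=1.0)  # first year of the experiment
now = scores(capacity, ambient=1.2)   # ten years later (assumed 20% rise)

for name in capacity:
    print(f"{name:>6}: {then[name]:.0f} ml -> {now[name]:.0f} ml")

# At any one time, differences between boxes come entirely from C.
# Across time, every box rises by the same ratio while C never changes.
ratios = {name: now[name] / then[name] for name in capacity}
print("rise per box:", {name: round(r, 2) for name, r in ratios.items()})
```

Under these assumptions the ordering of the boxes is the same in both years, and the average box ends up scoring what the above-average box scored at the start, which is the pattern the analogy describes.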

Scientists have a hard time considering, let alone accepting, this description of intelligence because they are working under a mistaken assumption. The evidence that variations in neuronal/genetic characteristics lead to corresponding differences in IQ scores is so strong and compelling that scientists have become convinced the inference is valid also in the other direction, that is, that every change in IQ score is necessarily accompanied by changes in neuronal/genetic characteristics. That unsupported assumption is what leads everyone astray. For instance, all the complexity of the Dickens-Flynn model is driven entirely by a perceived need to have environmental influences and neuronal/genetic characteristics interact. But in point of fact that interaction is not called for at all; all the evidence clearly indicates that changes in those two factors are essentially independent.

If there is one major consequence to this simple description of the Flynn effect, it is that it compels a complete reassessment of the word intelligence. Because of the perceived bi-directional linkage of IQ scores and neuronal/genetic characteristics, scientists have been restricting use of the word intelligence to the domain of that linkage alone. But neuronal/genetic characteristics tell only half the story at best. If we want to come to a complete and accurate understanding of the nature of intelligence, then we must take into account also the amount of non-biological pattern, structure and form tangibly contained within the human environment. And if it is the increase in intelligence that we are trying to explain, then all the focus must go towards the structural human surroundings, because that is the only factor that changes over time.



Dickens, W. T., & Flynn, J. R. (2001). Heritability estimates versus large environmental effects: The IQ paradox resolved. Psychological Review, 108, 346-369.

Saturday, June 15, 2013

A Match Made in Hell

On her Twitter feed, Michelle Dawson picks up on the troubling circumstances autistic kids face on a daily basis:
In parsing that sentence, I notice that the first half (“There is a puzzle piece”) is supplied by the advocacy of groups such as Autism Speaks, and the second half (“missing in their brain”) is harnessed from the shibboleths of neuroscience. Talk about a nasty double team.

Sunday, June 2, 2013

Measures of Success

It's with split interest that I note the publication of Kuhl (2013) and the dissenting comments from two of the paper's reviewers, Jon Brock and Dorothy Bishop. As Brock and Bishop rightly note, Kuhl (2013) resorts to an after-the-fact cherry picking of data from a broad array of dubious measures, and presents those post-selected findings as significant. The dissents of Brock and Bishop are part of a slowly growing movement against such questionable techniques — a few in the scientific community have begun to recognize (perhaps with some egg on their faces) that all this post-hoc data mining might not be the best route forward in the advancement of human understanding. I applaud this growing dissent, feeble though it may be.

But there's also a bitter irony to be found here. As noted in my previous post, it is Dorothy Bishop herself who has pronounced, without the slightest hint of disingenuousness, the recipe for success in today's science: “If you want to make your way in the scientific world, there are two important things you have to do: get grant funding and publish papers.” Well, let's compare Kuhl (2013) against those criteria, shall we? Let's see, the paper was supported by grants from the National Institute of Mental Health and the National Institute of Child Health and Human Development. So get grant funding, check. And of course we're all discussing this paper precisely because it has appeared in the highly regarded PLoS ONE journal. So publish papers, check. Heck, with the aid of a tacked-on co-author, Kuhl (2013) has even managed to score some media exposure, which will no doubt lead to further grant funding and more publications. So, bonus check check. By Dorothy Bishop's criteria for modern scientific success, Kuhl (2013) could only be described as a stunning achievement!

Listen, I know that Kuhl, Brock, Bishop and all their scientific colleagues mean well, but I'm one of those old-fashioned folks who tend to judge people on what they do, not on what they mean or say. And the one thing I can say Kuhl, Brock, Bishop and all their scientific colleagues manage to do in common is stand firmly behind, indeed even form, the machinery of modern science (grant funding, formal publication, peer review, academic credentialing, co-authorships, etc.). So I'm having a hard time seeing how any of them has earned the right to complain about the costs of that machinery. Because make no mistake about it, one of the costs of that machinery is the massive proliferation of papers such as Kuhl (2013). It's as inevitable as 2 following 1.

I'm impressed when the dissenters are willing to speak out against the problem, but I'll be even more impressed when the dissenters quit justifying and forming the conditions of the problem.




Kuhl PK, Coffey-Corina S, Padden D, Munson J, Estes A, et al. (2013) Brain Responses to Words in 2-Year-Olds with Autism Predict Developmental Outcomes at Age 6. PLoS ONE 8(5): e64967. doi:10.1371/journal.pone.0064967