Society for Amateur Scientists

Design Your Experiments Part XIV: Quasi-experiments

by Kevin Kilty

We cannot run a carefully designed experiment to answer every question. There is not enough money, and there are not enough resources. There isn't always time. Occasionally, experiments are simply impossible. Here are a few examples:

  • During an epidemic we need to identify risk factors quickly.
  • Social scientists use observation or surveys of huge populations to gather basic data. Experiments on an equivalent scale are not possible.
  • Geologists and sometimes geophysicists make observations of phenomena deep underground, or very far removed in time. The inference space is inaccessible.
  • Astronomers have to observe events very far away where control is impossible.
If controlled experiments are not possible, then how does science progress? The short answer is that we have to resort to methods that fall a little short of being experiments. I don't want to imply that these methods are not useful; a great deal of very valuable work comes from them, especially when one considers the huge value of properly identifying the risk factors in an epidemic. However, in some cases these short-falling methods have become science itself, and we might do better.

Quasi-experiments

A quasi-experiment is a study that has most of the trappings of an experiment, but which is unable to control potential factors, or perhaps is not guided by an idea of what all the factors are. This lack of control sometimes leaves quasi-experiments with dubious outcomes. They often lack internally consistent logic, and one can find the potential for circular reasoning and other invalid arguments. Quasi-experiments with controversial outcomes almost always become embroiled in arguments over this lack of internal consistency.

Paradoxically, much of what geologists and geophysicists do is quasi-experimental. Despite the hard-science appearance of making measurements with instruments, geophysics is plagued by confounding and the influence of uncontrolled factors. Even cause and effect is difficult to ascertain at times.

A claim that scientists make for quasi-experiments is that they have external validity, by which they mean that the conclusions are widely applicable because they are drawn from a phenomenon as it exists in the field. I'm not entirely convinced of this. To the extent that one may draw correct conclusions from a quasi-experiment, I'd say the claim is true. Here are a couple of complicating issues, though.

Limitations of Quasi-experiments

How does a person draw wrong conclusions from a quasi-experiment? There are two ways. The first is through lack of control: unless there is some way to identify and control potentially confounding factors, a person can lose control of the explicit study factors in the quasi-experiment. I'll call this unrecognized confounding, just to have a label to apply. The second possibility is that inadequate understanding of cause and effect, combined with a quasi-experiment that is statistics-driven, will produce false associations.

Unrecognized confounding

In Part XIII, I mentioned a misuse of the Latin square design which I didn't explain at the time. Let me explain now. Notice that the Latin square design explicitly lists the nuisance factors in a 2-way classification. I cannot analyze these factors explicitly, though, since to do so would violate what I'm trying to accomplish with the square design. To show this, let me present an example from Barker's book. Consider the 2x2 square with two treatments--these being one factor (F) at two different levels (F(+1) and F(-1)). The two nuisance factors (N1 and N2) are also at two levels each. The square design is...

                              Nuisance 1
                              +1      -1
                      ---------------------
     Nuisance 2       +1      F(+1)   F(-1)
                      -1      F(-1)   F(+1)
                      ---------------------
There are only four runs in this square. I can also organize this design in Yates order. The important portion of the Yates table is...
Yates Table of Latin Square Design


    F    N1    N2    FN1   FN2   N1N2
    -------------------------------
    -1   +1    -1    -1    +1    -1
    +1   +1    +1    +1    +1    +1
    -1   -1    +1    +1    -1    -1
    +1   -1    -1    -1    -1    +1
    -------------------------------
The problem is apparent immediately. Look at the columns which are identical, or, in other words, the factors that are perfectly correlated with one another. The nuisance factor N2 is aliased with the interaction FN1, the nuisance factor N1 is aliased with FN2, and, worst of all, F is aliased with N1N2. If I try to assign an effect to one of the nuisance factors, I find I can't guarantee that the effect is not actually an interaction between my control factor and the other nuisance factor. If I fail to recognize this implicit confounding, I run the risk of misinterpreting my results. The lesson is that an incomplete block design, which is guaranteed to happen in a quasi-experiment if I cannot recognize potentially confounding factors, will confound or alias the control factor with some interaction of nuisance factors. As always, knowledge about the factors and the nuisances is indispensable.
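The aliasing is easy to verify directly. In this minimal Python sketch, the run list is simply the four rows of the Yates table above, and each interaction column is the elementwise product of its parent columns:

```python
# The four runs of the 2x2 Latin square in Yates order: (F, N1, N2) per row
runs = [(-1, +1, -1), (+1, +1, +1), (-1, -1, +1), (+1, -1, -1)]

F    = [f for f, n1, n2 in runs]
N1   = [n1 for f, n1, n2 in runs]
N2   = [n2 for f, n1, n2 in runs]
FN1  = [f * n1 for f, n1, n2 in runs]    # interaction columns are products
FN2  = [f * n2 for f, n1, n2 in runs]
N1N2 = [n1 * n2 for f, n1, n2 in runs]

print(N2 == FN1)    # True: N2 is aliased with FN1
print(N1 == FN2)    # True: N1 is aliased with FN2
print(F == N1N2)    # True: the control factor is aliased with N1N2
```

Identical columns mean the design matrix cannot separate the two effects, no matter what analysis is applied afterward.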

I know that many people will say to me at this juncture, "Yes, but the nuisances are just random factors and their interaction should average to zero in a blocked design." My response is that this may be true in a large enough design, but this 2x2 block is too small for the law of large numbers to apply, and the probability of a confounding interaction is a near certainty. Latin squares are never very large. Quasi-experiments may or may not be large enough for the law of large numbers to take away the risk, and even if they are, how does a person demonstrate that the effects of confounding factors have averaged away? They will not average away if, for example, continual replication of the Latin square above does not randomize the confounders. You cannot average away correlated noise.
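A short simulation makes the point that replication alone cannot rescue the design. This is only a sketch with invented effect sizes: the nuisance interaction keeps the same sign in every replicate (the confounder is not randomized), so the estimate of F stays biased no matter how many squares are averaged:

```python
import random

random.seed(1)

TRUE_F = 1.0        # true coefficient of the control factor (invented)
INTERACTION = 0.5   # N1*N2 interaction effect, NOT randomized between replicates

# Yates-order runs of the 2x2 Latin square: (F, N1, N2)
runs = [(-1, +1, -1), (+1, +1, +1), (-1, -1, +1), (+1, -1, -1)]

estimates = []
for _ in range(10_000):          # replicate the square many times
    est = 0.0
    for f, n1, n2 in runs:
        y = TRUE_F * f + INTERACTION * (n1 * n2) + random.gauss(0, 0.1)
        est += f * y / 4         # contrast estimate of the F coefficient
    estimates.append(est)

mean_est = sum(estimates) / len(estimates)
print(round(mean_est, 2))        # about 1.5, not 1.0: the bias never averages away
```

Because F and N1N2 occupy the same column of the design, the contrast estimates their sum; averaging 10,000 replicates only shrinks the random noise, never the bias.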

This problem with not recognizing potentially confounding factors also affects the inference space of the quasi-experiment. The region of space where I obtained data with my experiment is the portion over which my results are most likely valid. One complaint made about designed experiments is that their inference space can be so restricted that their results do not pertain to any useful situation in the real world. True enough. A quasi-experiment, on the other hand, may not be able to identify exactly what this inference space is, which now leaves it open to constant attack. The results of an experiment are not very useful if people refuse to accept them.

Let me provide an example just bound to get me in trouble. We often hear about people investigating cancer clusters. How does the study of such a cluster proceed? I fear that it generally begins with a quasi-experiment designed to look at environmental factors--surveys and so forth. These use proxies in place of measurements of risks. In one that I know of, the proxy for exposure was job title. The researchers assumed that women who were electrical engineers were more highly exposed to "electromagnetic radiation." In fact, knowing something about how EEs work suggests to me that they are less exposed than other groups. Therefore, I have to wonder about the inference space of this quasi-experiment.

The real issue might be to find a nuisance factor that brought the cancers together into a cluster, rather than a risk factor that caused the cancers. Random occurrences are always bound to form clusters now and then. I won't learn a thing by studying why a fair coin came up heads 10 times in a row beginning at flip 575. Unfortunately, the general public is never satisfied with null results on a highly emotional subject. This is how we have managed to spend at least 10 billion dollars over 40 years studying the elusive link between cancer and electromagnetic fields.
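The coin-flip intuition is easy to check by simulation. In this sketch the sequence length and run length are arbitrary choices of mine, but they show that a seemingly remarkable run of 10 heads turns up somewhere in 1000 fair flips a sizeable fraction of the time:

```python
import random

random.seed(42)

def has_run(n_flips, run_len):
    """True if `run_len` consecutive heads occur somewhere in n_flips fair flips."""
    streak = 0
    for _ in range(n_flips):
        if random.random() < 0.5:   # heads
            streak += 1
            if streak >= run_len:
                return True
        else:
            streak = 0
    return False

trials = 2000
frac = sum(has_run(1000, 10) for _ in range(trials)) / trials
print(frac)   # a sizeable fraction of sequences contain such a "cluster" by chance
```

Studying why any one of those runs occurred where it did would tell us nothing, which is exactly the trap a cancer-cluster study can fall into.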

False associations

Quasi-experiments are driven by the data they generate and by the statistical tools used to analyze the data. This seems to leave out the most important part of the problem--the science, or something equivalent like causality. What we are trying to do is establish cause and effect so that we may make predictions, and eventually perform tests.

Let me make an example. Suppose that I am interested in some question of whether particular dieting causes a disease. Now I can organize very carefully controlled experiments to explore this issue. But no matter how important the question, and no matter how valid the conclusions I might draw from my experiment, I'll always find myself in trouble if I give my experimental subjects a disease. Thus I am reduced by ethical concerns to making a study of people on a diet, or not, quite of their own accord, and people who contract disease, or not, seemingly at random.

Let me take the worst possible case. Suppose in reality there is no association between the dieting and the disease. Ideally, the outcome of my quasi-experiment would show no statistical association between the two. Yet the world is full of associations that are not of the direct cause-and-effect sort. In particular, what if the disease and dieting both had an association with some other factor not in the direct line of causation from dieting to disease? I can offer one obvious example: both dieting and disease lead to loss of weight. If the data collected during the quasi-experiment are segregated by this additional factor, then a false relationship will appear between dieting and disease, even though there is no direct cause-and-effect relationship between them.
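The weight-loss example can be simulated directly. In this sketch all the probabilities are invented for illustration: dieting and disease are generated independently, both cause weight loss, and segregating the data by weight loss manufactures an association between them:

```python
import random

random.seed(0)

# counts of (diet, disease) among subjects who lost weight
counts = {(d, s): 0 for d in (0, 1) for s in (0, 1)}

for _ in range(200_000):
    diet = random.random() < 0.3       # dieting: independent of disease
    disease = random.random() < 0.1    # disease: independent of dieting
    # both dieting and disease lead to weight loss (plus a small background rate)
    weight_loss = diet or disease or (random.random() < 0.05)
    if weight_loss:                    # segregate the data by the shared factor
        counts[(int(diet), int(disease))] += 1

a, b = counts[(1, 1)], counts[(1, 0)]  # dieters with / without the disease
c, d = counts[(0, 1)], counts[(0, 0)]  # non-dieters with / without the disease
odds_ratio = (a / b) / (c / d)
print(round(odds_ratio, 2))  # far from 1.0, although diet and disease are independent
```

Restricted to the weight-loss group, a non-dieter almost has to have the disease to be in the sample at all, so a strong spurious association appears where none exists in the full population.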

Epidemiologists express risk, or association, in a very different way than I have in my other examples in this series. They use an odds ratio, which is a ratio of two ratios: the odds of developing the malady among the sample exposed to the risk, divided by the same odds among the sample not exposed to the risk. A neutral value of the odds ratio is 1.0, or nearly so. However, false associations do not depend on any particular way of measuring a relationship. The measurement could be a coefficient in a model built using regression, and the problem is the same. Recall in the example of the snap beans that any factor which merely mimicked the pattern of temperature in the design matrix (X) appeared significant.
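The odds ratio is simple to compute from a 2x2 table of counts; the counts below are hypothetical, chosen only to show the arithmetic:

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the malady among the exposed, divided by the odds among the unexposed."""
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)

# hypothetical counts: 20 cases among 120 exposed, 10 cases among 110 unexposed
print(odds_ratio(20, 100, 10, 100))  # 2.0: the exposed group has twice the odds
```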

This example of dieting and disease is very simple, of course, and it's unlikely that a competent researcher wouldn't notice what had happened. However, the example serves to illustrate how a false association could occur. Would such a thing ever happen? The answer is yes; anything that can happen eventually will. In a more complex situation a researcher might depend on a purely statistical assessment of a large number of potential factors. This could illuminate one false factor in the bright light of statistical significance. If this factor were added because of its statistical significance, and the data segregated using it, voilà, a spurious cause and effect would appear. The true nature of the association wouldn't be obvious without a detailed model of cause and effect.

Just as happened in my example of the storage of snap beans, an understanding of the physics of the problem, or of cause and effect if you will, is indispensable for drawing valid conclusions.

Non-experiments

Quasi-experiments are not necessarily invalid, but they are prone to more problems than carefully designed experiments are. Incompetence, on the other hand, whether it is applied to true experiments or to quasi-experiments, is another matter. A person can easily turn a perfectly good experiment into a non-experiment. Overly complex models, the misuse of blocking and Latin squares, misuse of regression or data reduction, and so forth are all potential risks.

Improving Quasi-experiments

My own thinking on quasi-experiments is not well enough polished to offer advice on how a person could improve them. Yet, my gut feeling is that in most cases a person could make them much more like true experiments with only a well-founded model of the process under study. By this I mean only a consistent model of cause and effect, or a thorough knowledge of the physics involved. I'm working on an example, a realistic one involving geology, which I hope to have ready for the conference in June in Philadelphia.

Closing remarks

This is my last installment of the series. I have found writing it to be an effort at times, but the effort has helped me straighten out my own thinking on experiments. I hope these materials help a few amateurs design good experiments as well, but I expect that better help is available through the dozen or so references I have supplied along the way. Good luck to you all.
