Design Your Experiments
Part XIII: Other Experiment Designs

By Kevin Kilty

It is about time for a short installment. So far in this series I have concentrated on factorial designs and single-factor designs. Here I will make brief mention of a few other experiment designs.

Nested Designs

If you refer back to Parts VI and X, I provided an example experiment regarding the storage of snap beans. This experiment looked like a factorial design, but it called for experiment runs at more than two temperatures (T) and more than two storage periods (W). If you look at the design carefully, you will notice that it calls for the same storage periods to be run at each temperature level. This is a nested design. It takes the form of a tree-like structure. Nested designs are not an exclusive category, because a full 2² factorial design is also a nested design.

If I run each treatment in the snap bean experiment once, then its particular design provides 12 independent measurements to determine factor coefficients in a model. The model I suggested at the time was Y = C0 + Ctw*T*W, although I argued in an earlier Part that a model like Y = C0 + Cw*W + Ctw*T*W is just as physically plausible. Obviously, if my model has only 3 coefficients, then the experiment design has a lot of implicit replication with which to estimate noise. The nested design of the snap bean experiment has sufficient treatments to estimate C0 and 11 other coefficients in a model, but no more. Moreover, whatever model I propose cannot contain terms beyond T² or beyond W³, because there are not enough temperature or storage-time levels among the treatments to support any others. Even if I ran replications at each treatment, I could determine no further coefficients; replication of this sort would serve only to give me a better grasp of noise.

Nested designs are not necessarily efficient. They may call for too many treatments to answer certain questions.
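To make the coefficient-counting concrete, here is a least-squares fit of the second model to simulated data. The temperature and storage-period levels, the coefficient values, and the noise level are all assumptions made for the sketch, not the original snap bean data:

```python
import numpy as np

# Hypothetical snap-bean layout: 3 temperatures x 4 storage periods
# gives the 12 treatments discussed above (levels are assumed).
T = np.repeat([0.0, 10.0, 20.0], 4)      # temperature of each run
W = np.tile([2.0, 4.0, 6.0, 8.0], 3)     # storage period of each run

# Design matrix for the model Y = C0 + Cw*W + Ctw*T*W.
X = np.column_stack([np.ones_like(T), W, T * W])

rng = np.random.default_rng(0)
true_coef = np.array([50.0, -2.0, -0.1])            # made-up values
Y = X @ true_coef + rng.normal(0.0, 0.5, size=12)   # simulated responses

# 12 runs and only 3 coefficients: the leftover degrees of freedom
# act as implicit replication for estimating noise.
coef, _, rank, _ = np.linalg.lstsq(X, Y, rcond=None)
print(rank)   # 3 -- all three coefficients are estimable
print(coef.round(2))
```

With more terms in the model, more columns go into X; the design supports at most 12 of them, and none higher than T² or W³, exactly as argued above.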
Refer back to Part X, where I calculated the hat matrix for the snap bean experiment. Some of the treatments were much more influential than others, in fact 9 times more. An efficient design could eliminate some of the least influential treatments. However, despite its occasional inefficiency, the nested design is useful for particular problems, and it has the advantage that I can use analysis of variance (ANOVA) in a very clear manner to analyze its results. For those of you who are unfamiliar with ANOVA, I'll refer you to my brief lab note on the topic in the November 16, 2001, issue of the SAS Bulletin. A very informative book which covers both experiment design and ANOVA is Quality by Experimental Design by Thomas B. Barker, 1986, Marcel Dekker, Inc.

Let me illustrate with an example. Manufacturers are always concerned about being able to inspect parts reliably. If I plan to inspect huge lots of parts, then I am forced to use several different inspectors using several different instruments. In such a circumstance I have to be concerned that the different operators and instruments are sources of variation in the inspection process, which might make it impossible to guarantee some level of precision even if the manufacturing process works very well. Therefore I will undertake an experiment, known in manufacturing as Gage Repeatability and Reproducibility (Gage R&R), to prove out my inspection process. The experiment uses a nested design. Let Ox indicate different operators, Ix different instruments, and Px placements of one part or a set of parts on the instrument. This set of parts remains constant throughout the experiment, so I know that there is no manufacturing variation during the experiment that I have to control. The table below summarizes my design.
Operators   |         O1          |         O2
Instruments |    I1    |    I2    |    I1    |    I2
Parts       | P1...Pk  | P1...Pk  | P1...Pk  | P1...Pk

Each operator uses each instrument and measures the same lot of parts each time. I will randomize my runs and replications to eliminate systematic problems, such as operator 1 being alert in the morning and tired just after noon, and so forth. The data which results contains the same number of replications for each treatment, which allows me to use analysis of variance in a simple manner to compare treatments and decide whether the factors of instruments, fixturing of the part, or operators contribute unusual variation. The uncertainty contributed jointly by the first two factors determines repeatability, and that contributed additionally by the last one determines reproducibility.

Blocked Designs and Latin Squares

I mentioned blocking in an earlier installment as a way to control for noise. One example I provided was that of paired samples. Each pair of experimental units receives the same set of treatments as any other pair, and what we examine is the difference in outcome within each pair. This design allows a person to control a single nuisance factor. If there are two nuisance factors, then a Latin square is a reasonable design. I briefly mentioned Latin squares before, and I alluded to their use in controlling for fertility gradients in fields during agricultural experiments. The two nuisance factors in that case are the gradient components in the x and y directions in the field. However, Latin squares are much more flexible than this: I can use one to control any two nuisance factors.

Obviously an example is in order. Suppose that I am testing a brand-name drug against three generic brands. I therefore have 4 treatments I wish to test. I'll designate these with 4 Latin letters: A, B, C, and D. I need to perform the test through a group of clinics and on a group of patients. I know from previous experience that I am likely to get different results from different patients just because people are ornery.
I am also likely to get different results on identical drugs from different clinics or shifts, because the staff in the clinics may have biases, or they may introduce other unknown factors. Therefore I have to randomize against these two sources of nuisance noise. I organize my experiment in the form of a 4x4 table as follows.

             Clinic
  Patient   1   2   3   4
  -----------------------
     1      A   B   C   D
     2      B   C   D   A
     3      C   D   A   B
     4      D   A   B   C
  -----------------------

This is a very orderly arrangement which is possibly suitable for my experiment. However, it is still susceptible to some systematic factor if one exists, so I may further randomize the Latin square by randomly rearranging the rows and columns. This preserves the important characteristic of the Latin square: it has one of each treatment in each row and each column. One possible result is

             Clinic
  Patient   2   1   4   3
  -----------------------
     2      C   B   A   D
     1      B   A   D   C
     3      D   C   B   A
     4      A   D   C   B
  -----------------------

I can draw lots to assign each clinic a number, and I can let each clinic assign patients a number as they come in the door. It would require a very unusual pattern of bias on the part of clinics and patient arrivals to produce a systematic bias in this study. If I need more replication, I can simply repeat this design over more trials.

The Latin square requires as many rows and columns as there are treatments to test, which makes it cumbersome for experiments involving many treatments. You will notice, no doubt, that there are 16 runs in this design for only 4 treatments. For this reason some people complain that Latin square designs use up degrees of freedom inefficiently. However, this is exactly why a Latin square is useful: it sacrifices degrees of freedom to control the unknown factors of patient and clinic variability.
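The row-and-column randomization described above is easy to script. A minimal sketch (the function names are my own) builds the orderly cyclic square, shuffles it, and verifies that the defining property survives:

```python
import random

def cyclic_latin_square(treatments):
    """The orderly arrangement above: row r is the treatment
    list rotated left by r positions."""
    n = len(treatments)
    return [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]

def randomize_square(square, rng=random):
    """Shuffle the row order and the column order. Each treatment
    still appears exactly once in every row and every column."""
    n = len(square)
    rows, cols = list(range(n)), list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    return [[square[r][c] for c in cols] for r in rows]

def is_latin(square):
    """Check the defining property of a Latin square."""
    n = len(square)
    symbols = set(square[0])
    return (all(set(row) == symbols for row in square) and
            all({square[r][c] for r in range(n)} == symbols
                for c in range(n)))

square = cyclic_latin_square(["A", "B", "C", "D"])
shuffled = randomize_square(square)
print(is_latin(shuffled))  # True
```

Because a row or column permutation never puts two copies of a treatment into the same line, any number of reshuffles yields a valid randomized square.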
An appraisal of the efficiency of this design, which is a comparison of the number of runs and replications required to achieve a particular level of certainty, shows that Latin square designs can be much more efficient than completely randomized designs. It is also easy to misuse a squared design. To keep this installment short, I'll postpone a discussion of misuse to Part XIV, the last installment in this series.

Intentional Confounding

Because Latin square designs are very useful, and because they grow very large if there are many treatments, a common variation on them is to use several smaller squares in a single design and purposely alias particular interactions among factors in each one. Intentional aliasing, or confounding, allows me to use less than a full set of treatments in each Latin square, which economizes the experiment. Confounding is just an alternative term for aliasing. The general rule for intentional confounding is never to alias main effects, or what I sometimes call primary factors. Better still is to understand the system under study so well that one can design the confounding to avoid aliasing important interactions of any kind.

Composite Designs

Let me return to a couple of earlier experiment designs and show what they look like graphically. A full 2² factorial design creates a square pattern of 4 different treatments in response space that looks like this:

              B axis
                |
        *    +1 +    *
                |
       -1       |      +1     A axis
     -----|-----|------|-----
                |
        *    -1 +    *
                |

The nested design of the snap bean experiment, on the other hand, filled a rectangular region of space with 12 treatments, like so:

    W axis
      |
    8 +  *        *        *
    6 +  *        *        *
    4 +  *        *        *
    2 +  *        *        *
      |                        T axis
    --+-------|--------|-----
             +10      +20

I mentioned that the snap bean design was not especially efficient, but that it could provide information on higher-order terms in T² or W³.
A design that can do much the same using only 9 treatments is the composite of the full 2² factorial and a star pattern, with one treatment at a central position. This particular design is called a central composite design (CCD) because of its centered and symmetric form. An unlimited number of other composites are possible, including different patterns in different parts of response space, each designed to answer a particular question efficiently.

              W axis
                |
                *
                |
         *      +      *
                |
     ---*-------*-------*---   T axis
                |
         *      +      *
                |
                *
                |

       A central composite design

By the way, some people refer to a drawing like this, which shows the locations of treatments in space, as a picture of the inference space, because it illustrates the portion of space over which the experiment will provide data and over which a person can draw valid inferences.

Evolutionary Operation

This is a design that lets me gather data on a process while it operates. I can use it to continually adjust and optimize the process, for example. It allows a systematic investigation of the response surface during operation. Because one requirement is that my experiment not produce bad product, the design is limited to making very small changes to the controlling factors and then evaluating the results. The design is a full 2² factorial supplemented with a single center point; in other words, it is an abbreviated central composite design. The treatment which the center point represents is the current operating target of the process. The four surrounding treatments are very small steps away from the target. Because they are small steps, the changes in the output of the process are also very small, so small in fact that I have to worry about them being the result of noise rather than of any significance. Luckily, the process is running the entire time, so I can replicate this experiment over and over to reduce uncertainty. The sequence of this design is as follows.
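For those who prefer coordinate lists to pictures, the two composite patterns above can be generated programmatically. In this sketch the function names and the step size are my own choices; the axial distance defaults to the common rotatable value for two factors, sqrt(2):

```python
from itertools import product

def central_composite(alpha=2 ** 0.5):
    """Two-factor central composite design: the 2x2 factorial
    corners, four axial ('star') points at distance alpha, and one
    center point -- nine treatments in all."""
    corners = list(product((-1.0, 1.0), repeat=2))
    star = [(alpha, 0.0), (-alpha, 0.0), (0.0, alpha), (0.0, -alpha)]
    center = [(0.0, 0.0)]
    return corners + star + center

def abbreviated_pattern(step=0.05):
    """The in-process variant: factorial corners plus the center
    point only, scaled to very small steps about the operating
    target (the step size here is an arbitrary example value)."""
    corners = [(x * step, y * step) for x, y in product((-1.0, 1.0), repeat=2)]
    return corners + [(0.0, 0.0)]

print(len(central_composite()))    # 9
print(len(abbreviated_pattern()))  # 5
```

The abbreviated five-point pattern is run repeatedly on the live process, with replication over many cycles standing in for the larger treatment set.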