Chapter 21

Design and Analysis of Experiments

Abstract

This chapter discusses the concepts related to the design of experiments, as well as their application in the control of processes. In order for the technique to be applied correctly, the necessary steps are described here, as well as the basic principles that must be considered during the planning of experiments. The types of experimental design are also presented in this chapter, as well as the analysis of variance technique.

Keywords

Design of experiments; Factor; Treatment; Randomization; Analysis of variance; One-way ANOVA; Two-way ANOVA; Factorial ANOVA

Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.

H.G. Wells

21.1 Introduction

The design of experiments (DOE) has been frequently used in the control of processes to identify the factors that impact the quality of products and services. Through the DOE, it is possible to improve the adjustment of the process and product design, as well as to reduce the time necessary to develop new products and processes.

Montgomery (2013) defines an experiment as a test or a series of tests in which intentional changes are made to the input variables of a process, so that the corresponding changes in the output variable can be observed and identified. The process presented in Fig. 21.1 transforms resources (inputs) into new products, goods, or services (outputs) for internal and external clients. Some process variables are controllable, while others are uncontrollable.

Fig. 21.1 Flowchart of a process.

According to Montgomery (2013), the objectives of an experiment include:

  1) Determining which input variables most influence the dependent (response) variable y;
  2) Determining the set of x variables and their respective values so that y is as close as possible to the desired value;
  3) Determining the set of x variables and their respective values so that the variability in y is as small as possible;
  4) Determining the set of x variables and their respective values so that the effects of the uncontrollable variables z are minimized.

We will discuss the main terms used in the design of experiments based on Banzatto and Kronka (2006).

A treatment or factor is any method, element, or material that one wishes to measure, test, or assess in an experiment. In an experiment, there may be more than one factor and more than one dependent or response variable. Treatments or factors correspond to the model’s explanatory variables. An experiment compares at least two treatments, which can be qualitative or quantitative. Examples of treatments include fertilizers, insecticides, different equipment to assess the noise level in the workplace, different equipment to measure thermal stress, different methods to assess body composition, soil treatments to evaluate the production of watermelon and melon, different types of products, different ages, different periods of time, etc.

On the other hand, the term experimental unit corresponds to the unit, member, physical entity, or place to which the treatment is applied. This supplies the data to be analyzed. An experimental unit can be an animal, a patient, a plot of land, an engine, a piece of equipment, a customer, etc.

The variation that happens randomly in an experiment, due to uncontrollable variables, is called experimental error.

21.2 Steps in the Design of Experiments

Montgomery (2013) describes the necessary steps for applying the DOE technique:

  1. Defining the problem: a clear definition of the problem and of the experiment’s objectives significantly helps to better understand and solve the problem.
  2. Choosing the factors and levels: in an experiment, we define the factors and their respective variation ranges, and the specific levels used in the procedure.
  3. Defining the response variable: we usually use the mean or the standard deviation (or both) of the characteristic being evaluated as the response variable.
  4. Choosing the type of design: Section 21.4 discusses the types of design of experiments.
  5. Conducting the experiment: when conducting the experiment, it is necessary to monitor the process carefully in order to ensure that the experiment will happen as planned.
  6. Data analysis: statistical techniques are used to analyze the data from the experiment.
  7. Conclusions and recommendations: to validate the experiment’s results and conclusions, graphical methods and also exploratory and confirmatory tests are used.

21.3 The Four Principles of Experimental Design

To ensure that the data are collected correctly, four basic principles must be considered during the design of experiments. Sharpe et al. (2015) describe each one of them:

  1. Randomization: this principle consists of randomly assigning the treatments to the experimental units, in a way that each treatment has the same chance of occupying any experimental unit. This principle minimizes the effects of unknown and uncontrollable variables.
  2. Replication: the number of times each treatment appears in the experiment. If the number of replications is the same for each treatment, we have a balanced experiment. Through replication, we can estimate the experimental error, increase the experiment’s precision, and even increase the robustness of the statistical tests.
  3. Control: controlling extraneous sources of variation significantly reduces the variability of the response variables, making it easier to discern the differences between the experimental units or treatment groups. In test drives, for example, all of the alternatives must be offered to customers at the same time and under the same conditions. Otherwise, external variables, such as the price of gasoline, the volatility of the stock market, and fluctuations in the interest rate, among others, would make it difficult to assess the effects of the treatments.
  4. Blocking: in some cases, there may be one uncontrollable factor that directly affects the response variable or the way in which the factors being studied influence the response. To minimize this effect, the experimental units are grouped into blocks or homogeneous groups, in such a way that the experiment is analyzed separately within each block. Unlike the first three principles, blocking is not necessary in all experiments.

21.4 Types of Experimental Design

Sharpe et al. (2015) and Banzatto and Kronka (2006) describe three types of design of experiments: (a) completely randomized, (b) randomized blocks, (c) factorial.

21.4.1 Completely Randomized Design (CRD)

The completely randomized design is the simplest of all experimental designs. It only uses the principles of randomization and replication. The treatments are assigned to the units in a totally random way, with either the same or a different number of replications per treatment. The CRD only considers one explanatory variable with two or more categories.

Imagine an experiment in which one wishes to test two types of diet with two groups of patients. Thus, 100 patients are randomly divided into 2 groups of the same size and the diets are assigned to these groups in a random way, as shown in Fig. 21.2.

Fig. 21.2 An example of a completely randomized design. (Modified from Sharpe, N.R., de Veaux, R.D., Velleman, P.F., 2015. Business Statistics. 3rd ed. Pearson Education.)

One-way ANOVA has been widely used to analyze data coming from a completely randomized design.
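
As an illustration of the diet example above, the following minimal sketch shuffles the experimental units and splits them into groups of equal size. The function name, the patient identifiers, the treatment labels, and the seed are all hypothetical and serve only to make the sketch reproducible.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly split `units` into equal-sized groups, one per treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among the treatments")
    rng = random.Random(seed)       # local generator, so the seed is reproducible
    shuffled = list(units)          # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }


# Hypothetical data: 100 patients identified by the numbers 1 to 100.
patients = list(range(1, 101))
groups = completely_randomized_assignment(patients, ["diet A", "diet B"], seed=42)
print({diet: len(members) for diet, members in groups.items()})
```

Because every ordering of the shuffled list is equally likely, each patient has the same chance of ending up in either diet group, which is what the randomization principle requires.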

21.4.2 Randomized Block Design (RBD)

The randomized block design is the most commonly used design. Besides the principles of randomization and replication, it also considers the principle of local control by creating blocks. Thus, the units are grouped into homogeneous blocks. Within each block, we assign the different factors or treatments randomly. The main objective is to reduce the variability within each block and to identify the effect the factors have on the dependent or response variable.

The number of units per block is equal to the number of factors or treatments being studied. The factors or treatments are distributed in the units in a random way, in such a way that the randomization is carried out within each block.

Imagine an experiment with 600 patients from a health care clinic divided into two groups: healthier and not so healthy. For each group, 300 patients were selected randomly and three different treatments were assigned at random to these patients, in order for each subgroup with 100 patients to undergo a certain treatment. The main objective here is to analyze the effect of three types of food production systems on these patients’ health: (a) food from conventional production; (b) food from organic production; (c) food from biodynamic production. Fig. 21.3 describes this process in a simplified way.

Fig. 21.3 An example of a randomized block design.
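
The patient example can be sketched in code as follows. This is only an illustrative sketch: the function name, the patient identifiers, and the seed are assumptions, while the block names and the three food production systems come from the example. The key point is that the shuffle is carried out separately inside each block.

```python
import random


def randomized_block_assignment(blocks, treatments, seed=None):
    """Randomize the treatments separately within each block."""
    rng = random.Random(seed)
    plan = {}
    for block_name, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError("each block must divide evenly among the treatments")
        shuffled = list(units)
        rng.shuffle(shuffled)       # randomization happens inside the block
        size = len(shuffled) // len(treatments)
        plan[block_name] = {
            treatment: shuffled[i * size:(i + 1) * size]
            for i, treatment in enumerate(treatments)
        }
    return plan


# Hypothetical identifiers for the 300 patients in each block.
blocks = {
    "healthier": [f"H{i}" for i in range(1, 301)],
    "not so healthy": [f"N{i}" for i in range(1, 301)],
}
treatments = ["conventional", "organic", "biodynamic"]
plan = randomized_block_assignment(blocks, treatments, seed=7)
for block_name, groups in plan.items():
    print(block_name, {t: len(members) for t, members in groups.items()})
```

Each block ends up with three subgroups of 100 patients, so every treatment is compared within groups of patients that are similar with respect to the blocking variable.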

21.4.3 Factorial Design (FD)

When there are two or more factors in the experiment being carried out, the researcher uses the factorial design.

In an experiment with two factors, all the possible combinations of the levels of these factors are investigated in each replication of the experiment. Therefore, if there are two factors A and B, with a levels of factor A and b levels of factor B, then each replication contains all a ⋅ b possible combinations (Montgomery, 2013).

Two-way ANOVA has been broadly used to analyze data coming from a factorial design considering two factors.
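
To make the a ⋅ b counting concrete, the sketch below lists the runs of a two-factor factorial design. The factor names, their levels, and the number of replications are hypothetical; in practice, the order of the runs would also be randomized.

```python
from itertools import product


def full_factorial(levels_a, levels_b, replications=1):
    """List every (replication, level of A, level of B) run of the design."""
    runs = []
    for rep in range(1, replications + 1):
        # product() yields all a * b combinations of the two factors' levels
        for level_a, level_b in product(levels_a, levels_b):
            runs.append((rep, level_a, level_b))
    return runs


temperatures = ["low", "high"]      # hypothetical factor A, a = 2 levels
catalysts = ["X", "Y", "Z"]         # hypothetical factor B, b = 3 levels
runs = full_factorial(temperatures, catalysts, replications=2)
print(len(runs))                    # 2 replications * (2 * 3) combinations
for run in runs[:6]:
    print(run)
```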

21.5 One-Way Analysis of Variance

A single factor or one-way analysis of variance (one-way ANOVA) has been widely used to analyze data obtained from a completely randomized design. These data could also be analyzed by using regression models.

According to Fávero et al. (2009), one-way ANOVA allows the researcher to verify the effect a qualitative explanatory variable (factor) has on a quantitative dependent variable. Each group includes the observations of the dependent variable in one of the factor’s categories.

One-way ANOVA was discussed in Section 9.8.1 in Chapter 9. All the concepts of the one-way ANOVA, its hypotheses, its model, and the respective calculations can be found in detail in that section. The application of the one-way ANOVA is described in Example 9.12, together with its solution in SPSS and Stata. In that example, the factor corresponds to the variable Supplier and the dependent variable is Sucrose.
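
As a reminder of the calculation, the following sketch computes the F statistic of a one-way ANOVA from the between-group and within-group sums of squares. It follows the standard textbook decomposition and is not taken from Example 9.12; the function name and the three small groups of observations are hypothetical.

```python
def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F statistic."""
    k = len(groups)                             # number of factor categories
    n = sum(len(g) for g in groups)             # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation explained by the factor (between groups)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Unexplained variation, i.e. experimental error (within groups)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "F": f_stat,
    }


# Hypothetical observations for three categories of a single factor.
result = one_way_anova([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(result)
```

The F statistic is then compared with the critical value of the F distribution with (k − 1, n − k) degrees of freedom at the chosen significance level. Statistical packages also report the corresponding P-value.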

21.6 Factorial ANOVA

Factorial ANOVA is an extension of the one-way ANOVA considering two or more factors. Factorial ANOVA assumes that the quantitative dependent variable is affected by more than one qualitative explanatory variable (factor). It also tests the possible interactions between the factors.

For Pestana and Gageiro (2008) and Fávero et al. (2009), the main goal of factorial ANOVA is to determine if the means for each factor level are the same (isolated effect of the factors on the dependent variable) and to verify the interaction between the factors (joint effect of the factors on the dependent variable).

Two-way ANOVA was discussed in Section 9.8.2.1 in Chapter 9. All the concepts of the two-way ANOVA, its hypotheses, its model, and the respective calculations can be found in that section. The application of the two-way ANOVA is described in Example 9.13, together with its solution in SPSS and Stata. In that example, the fixed factors correspond to the variables Company and Day_of_the_week, and the dependent variable is Time.

The two-way ANOVA can be generalized for three or more factors. According to Maroco (2014), the model becomes very complex, since the effect of multiple interactions can confound the effect of the factors (Section 9.8.2.2).
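
The sketch below illustrates how a two-way ANOVA separates the isolated effect of each factor from their interaction. It assumes a balanced design with fixed factors and at least two replications per cell, and it uses the standard sums of squares decomposition rather than the data in Example 9.13. The function name, the level labels, and the observations are hypothetical.

```python
def two_way_anova(cells):
    """F statistics for factors A and B and their interaction (balanced design).

    `cells` maps each (level of A, level of B) pair to its list of observations.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    a, b = len(levels_a), len(levels_b)
    r = len(next(iter(cells.values())))         # replications per cell
    if len(cells) != a * b or any(len(v) != r for v in cells.values()) or r < 2:
        raise ValueError("a balanced design with at least 2 replications is required")

    n = a * b * r
    grand = sum(sum(v) for v in cells.values()) / n
    mean_a = {i: sum(sum(cells[(i, j)]) for j in levels_b) / (b * r) for i in levels_a}
    mean_b = {j: sum(sum(cells[(i, j)]) for i in levels_a) / (a * r) for j in levels_b}
    mean_cell = {key: sum(v) / r for key, v in cells.items()}

    ss = {
        "A": b * r * sum((m - grand) ** 2 for m in mean_a.values()),
        "B": a * r * sum((m - grand) ** 2 for m in mean_b.values()),
        # Interaction: what the cell means show beyond the two main effects
        "AB": r * sum((mean_cell[(i, j)] - mean_a[i] - mean_b[j] + grand) ** 2
                      for i in levels_a for j in levels_b),
        "error": sum((x - mean_cell[key]) ** 2 for key, v in cells.items() for x in v),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "error": a * b * (r - 1)}
    ms_error = ss["error"] / df["error"]
    f_stats = {src: (ss[src] / df[src]) / ms_error for src in ("A", "B", "AB")}
    return ss, df, f_stats


# Hypothetical 2 x 2 design with 2 replications per cell.
data = {
    ("a1", "b1"): [1, 3], ("a1", "b2"): [5, 7],
    ("a2", "b1"): [3, 5], ("a2", "b2"): [11, 13],
}
ss, df, f_stats = two_way_anova(data)
print(ss, df, f_stats)
```

Each F statistic is compared with the critical value of the F distribution for its own degrees of freedom and those of the error term. With three or more factors, the same decomposition gains one term for every higher-order interaction, which is why the model quickly becomes complex.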

21.7 Final Remarks

The design of experiments technique has often been used to control processes, aiming at identifying the explanatory variables or factors that affect the quality of products and services (dependent or response variable).

Among all the experimental designs, the completely randomized design is the simplest and considers only one explanatory variable with two or more categories. One-way ANOVA has been widely used to analyze data coming from a completely randomized design.

On the other hand, the randomized block design is used more frequently. Finally, when the experiment considers two or more factors, we use the factorial design. Two-way ANOVA has been broadly used to analyze data that comes from a design with two factors.

21.8 Exercises

  1) An aerospace company manufactures civilian and military helicopters at its three factories. Table 21.1 shows its monthly helicopter production in the last 12 months, in each factory. Check whether there is a difference between the population means. Assume that α = 5%.

    Table 21.1

    Monthly Helicopter Production for Each Factory

    Factory 1   Factory 2   Factory 3
       24          28          29
       26          26          25
       28          24          24
       22          30          26
       31          24          20
       25          27          22
       27          25          22
       28          29          27
       30          30          20
       21          27          26
       20          26          24
       24          25          25
  2) A steel company wants to know how the factors “Type of iron ore” and “Type of converter” affect the properties of steel, more specifically the Brinell hardness (BH), measured in kgf/mm². In order to do that, an experiment with 81 samples was carried out, with 3 types of iron ores (hematite, limonite, magnetite) and 3 types of converters (Bessemer, LD, and Siemens-Martin). For each experimental unit, the hardness was measured. The data are available in Table 21.2.

    Table 21.2

    Brinell Hardness (BH) per Type of Iron Ore and Converter

                                        Type of Iron Ore
    Type of Converter    Hematite         Limonite         Magnetite
    Bessemer             161 154 149      145 151 154      168 165 174
                         157 163 150      141 147 153      163 175 172
                         161 165 156      139 155 140      181 182 180
    LD                   164 169 152      134 144 140      165 164 177
                         149 155 164      139 142 149      181 183 165
                         167 159 160      133 129 137      167 178 179
    Siemens-Martin       169 165 152      135 141 148      165 166 183
                         154 163 167      130 142 129      175 178 179
                         159 151 165      137 135 141      164 183 179

  3) A gas and oil company wants to understand how petroleum refining processes and the type of petroleum impact gasoline quality parameters, more specifically its octane rating. In order to do that, an experiment with 48 samples was carried out, considering 4 petroleum refining processes (distillation, cracking, reforming, and alkylation) and 3 types of petroleum (light, naphthenic, and paraffinic). For each experimental unit, the octane rating was measured. The data are available in Table 21.3.

    Table 21.3

    Octane Rating per Type of Petroleum and Refining Process

                                    Petroleum Refining Process
    Type of Petroleum    Distillation    Cracking    Reforming    Alkylation
    Light                95 95           95 97       96 94        95 96
                         94 94           95 96       94 93        96 95
    Naphthenic           87 86           89 90       86 87        89 91
                         86 87           88 90       87 85        90 89
    Paraffinic           90 91           92 91       90 91        89 92
                         92 89           90 92       92 90        92 91

References

Banzatto D.A., Kronka S.N. Experimentação agrícola. fourth ed. Jaboticabal: Funep; 2006.

Fávero L.P., Belfiore P., Silva F.L., Chan B.L. Análise de dados: modelagem multivariada para tomada de decisões. Rio de Janeiro: Campus Elsevier; 2009.

Maroco J. Análise estatística com o SPSS Statistics. sixth ed. Lisboa: Edições Sílabo; 2014.

Montgomery D.C. Introduction to Statistical Quality Control. seventh ed. Arizona State University: John Wiley & Sons, Inc; 2013.

Pestana M.H., Gageiro J.N. Análise de dados para ciências sociais: a complementaridade do SPSS. fifth ed. Lisboa: Edições Sílabo; 2008.

Sharpe N.R., de Veaux R.D., Velleman P.F. Business Statistics. third ed. Pearson Education; 2015.

