This chapter discusses the concepts of the design of experiments and their application to process control. It describes the steps needed to apply the technique correctly, as well as the basic principles that must be considered when planning experiments. The types of experimental design and the analysis of variance technique are also presented.
Design of experiments; Factor; Treatment; Randomization; Analysis of variance; One-way ANOVA; Two-way ANOVA; Factorial ANOVA
Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.
H.G. Wells
The design of experiments (DOE) has been frequently used in the control of processes to identify the factors that impact the quality of products and services. Through the DOE, it is possible to improve the adjustment of the process and product design, as well as to reduce the time necessary to develop new products and processes.
Montgomery (2013) defines an experiment as a test or a series of tests in which intentional changes are made to the input variables of a process, so that the corresponding changes in the output variable can be observed and identified. The process presented in Fig. 21.1 transforms resources (inputs) into new products, goods, or services (outputs) for internal and external clients. Some process variables are controllable, while others are uncontrollable.
According to Montgomery (2013), the objectives of an experiment include:
We will discuss the main terms used in the design of experiments based on Banzatto and Kronka (2006).
A treatment or factor is any method, element, or material that one wishes to measure, test, or assess in an experiment. An experiment may have more than one factor and more than one dependent or response variable. Treatments or factors correspond to the model’s explanatory variables. An experiment compares at least two treatments, which can be qualitative or quantitative. Examples of treatments include fertilizers, insecticides, different equipment to assess the noise level in the workplace, different equipment to measure thermal stress, different methods to assess body composition, soil treatments to evaluate the production of watermelon and melon, different types of products, different ages, different periods of time, etc.
The term experimental unit, in turn, corresponds to the unit, member, physical entity, or place to which the treatment is applied. It supplies the data to be analyzed. An experimental unit can be an animal, a patient, a plot of land, an engine, a piece of equipment, a customer, etc.
The variation that happens randomly in an experiment, due to uncontrollable variables, is called experimental error.
Montgomery (2013) describes the necessary steps for applying the DOE technique:
To ensure that the data are collected correctly, four basic principles must be considered during the design of experiments. Sharpe et al. (2015) describe each one of them:
Sharpe et al. (2015) and Banzatto and Kronka (2006) describe three types of design of experiments: (a) completely randomized, (b) randomized blocks, (c) factorial.
The completely randomized design (CRD) is the simplest of all experimental designs. It uses only the principles of randomization and replication. The treatments are assigned to the units in a totally random way, with either the same or a different number of replications per treatment. The CRD considers only one explanatory variable with two or more categories.
Imagine an experiment in which one wishes to test two types of diet with two groups of patients. Thus, 100 patients are randomly divided into 2 groups of the same size and the diets are assigned to these groups in a random way, as shown in Fig. 21.2.
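The random split described in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient IDs 1 to 100, the labels `diet_1` and `diet_2`, the function name, and the fixed seed are all hypothetical and are not taken from the original study.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly split `units` into equal-sized groups, one per treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # every ordering of the units is equally likely
    size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }


# Hypothetical patient IDs and diet labels, used only for illustration.
patients = list(range(1, 101))
groups = completely_randomized_assignment(patients, ["diet_1", "diet_2"], seed=42)
print({diet: len(members) for diet, members in groups.items()})
# {'diet_1': 50, 'diet_2': 50}
```

Fixing the seed makes the allocation reproducible for auditing; omitting it gives a fresh random allocation on every run.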
One-way ANOVA has been widely used to analyze data coming from a completely randomized design.
The randomized block design is the most common design. Besides the principles of randomization and replication, it also considers the principle of local control by creating blocks. Thus, the units are grouped into homogeneous blocks. Within each block, the different factors or treatments are assigned randomly. The main objective is to reduce the variability within each block and to identify the effect the factors have on the dependent or response variable.
In the basic form of this design, the number of units per block is equal to the number of factors or treatments being studied. The factors or treatments are assigned to the units at random, and the randomization is carried out separately within each block.
Imagine an experiment with 600 patients from a health care clinic divided into two blocks: healthier and less healthy. Within each block of 300 randomly selected patients, three different treatments are assigned at random, so that each subgroup of 100 patients receives one treatment. The main objective here is to analyze the effect of three types of food production systems on these patients’ health: (a) food from conventional production; (b) food from organic production; (c) food from biodynamic production. Fig. 21.3 describes this process in a simplified way.
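The key difference from a completely randomized design is that the shuffle happens separately inside each block. A minimal sketch of this allocation follows; the block names, patient IDs, treatment labels, function name, and seed are hypothetical placeholders chosen to mirror the example, not data from the source.

```python
import random


def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, randomly split its units equally among treatments."""
    rng = random.Random(seed)
    plan = {}
    for block_name, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"block {block_name!r} cannot be split evenly")
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomization is restricted to this block
        size = len(shuffled) // len(treatments)
        plan[block_name] = {
            treatment: shuffled[i * size:(i + 1) * size]
            for i, treatment in enumerate(treatments)
        }
    return plan


# Hypothetical IDs: patients 1..300 form one block, 301..600 the other.
blocks = {
    "healthier": list(range(1, 301)),
    "less_healthy": list(range(301, 601)),
}
treatments = ["conventional", "organic", "biodynamic"]
plan = randomized_block_assignment(blocks, treatments, seed=7)
for block_name, assigned in plan.items():
    print(block_name, {t: len(ids) for t, ids in assigned.items()})
```

Because no patient ever leaves their block, differences between the healthier and less healthy patients cannot be mistaken for treatment effects.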
When there are two or more factors in the experiment being carried out, the researcher uses the factorial design.
In an experiment with two factors, all possible combinations of the levels of these factors are investigated in each replication of the experiment. Therefore, if factor A has a levels and factor B has b levels, each replication contains all a ⋅ b possible combinations (Montgomery, 2013).
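Enumerating the runs of a two-factor factorial design amounts to taking the Cartesian product of the factor levels and repeating it once per replication. The sketch below assumes hypothetical level names (`A1` to `A3`, `B1` and `B2`) and a hypothetical function name; it only lists the runs and does not randomize their order.

```python
from itertools import product


def full_factorial(levels_a, levels_b, replications=1):
    """List every (replication, level of A, level of B) run of the design."""
    runs = []
    for rep in range(1, replications + 1):
        for a_level, b_level in product(levels_a, levels_b):
            runs.append((rep, a_level, b_level))
    return runs


# Hypothetical levels: a = 3 levels of factor A, b = 2 levels of factor B.
levels_a = ["A1", "A2", "A3"]
levels_b = ["B1", "B2"]
runs = full_factorial(levels_a, levels_b, replications=2)
print(len(runs))  # 12: a * b = 6 combinations in each of 2 replications
```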
Two-way ANOVA has been broadly used to analyze data coming from a factorial design considering two factors.
A single factor or one-way analysis of variance (one-way ANOVA) has been widely used to analyze data obtained from a completely randomized design. These data could also be analyzed by using regression models.
According to Fávero et al. (2009), one-way ANOVA allows the researcher to verify the effect a qualitative explanatory variable (factor) has on a quantitative dependent variable. Each group includes the observations of the dependent variable in one of the factor’s categories.
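As a rough illustration of how the groups enter the calculation, the F statistic of a one-way ANOVA can be computed from the between-group and within-group sums of squares. The sketch below uses plain Python and a small made-up dataset (three groups of three observations); the numbers and the function name are hypothetical and are not related to Example 9.12. In practice, a library routine such as `scipy.stats.f_oneway` would normally be used, since it also returns the p-value.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)                      # number of factor categories
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up observations for three categories of a single factor.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_between, df_within = one_way_anova(groups)
print(round(f_stat, 4), df_between, df_within)  # 13.0 2 6
```

For this toy dataset the between-group sum of squares is 26 and the within-group sum of squares is 6, which gives F = (26/2)/(6/6) = 13 with 2 and 6 degrees of freedom.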
One-way ANOVA was discussed in Section 9.8.1 in Chapter 9. All the concepts of the one-way ANOVA, its hypotheses, its model, and respective calculations can be found in that section in a very detailed way. The application of the one-way ANOVA is described in Example 9.12, as well as its solution on SPSS and on Stata software. In that example, the factor corresponds to the variable Supplier and the dependent variable is Sucrose.
Factorial ANOVA is an extension of the one-way ANOVA considering two or more factors. Factorial ANOVA assumes that the quantitative dependent variable is affected by more than one qualitative explanatory variable (factor). It also tests the possible interactions between the factors.
For Pestana and Gageiro (2008) and Fávero et al. (2009), the main goal of factorial ANOVA is to determine if the means for each factor level are the same (isolated effect of the factors on the dependent variable) and to verify the interaction between the factors (joint effect of the factors on the dependent variable).
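The isolated and joint effects mentioned above correspond to separate sums of squares in a balanced two-factor design. The following sketch shows that decomposition in plain Python, under the assumptions of a balanced layout with at least two replicates per cell; the 2 × 2 dataset and the function names are hypothetical and are not related to Example 9.13.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_anova(data):
    """Balanced two-way ANOVA with interaction.

    data[i][j] holds the replicates for level i of factor A and level j of
    factor B. Returns {source: (sum_of_squares, degrees_of_freedom, F)},
    where F is None for the error row.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    if any(len(row) != b or any(len(c) != n for c in row) for row in data):
        raise ValueError("design must be balanced")
    if n < 2:
        raise ValueError("need at least two replicates per cell")

    cell = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
    row = [mean(cell[i]) for i in range(a)]                      # factor A means
    col = [mean([cell[i][j] for i in range(a)]) for j in range(b)]  # factor B means
    grand = mean(row)

    ss = {
        "A": b * n * sum((r - grand) ** 2 for r in row),
        "B": a * n * sum((c - grand) ** 2 for c in col),
        "AxB": n * sum(
            (cell[i][j] - row[i] - col[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "Error": sum(
            (y - cell[i][j]) ** 2
            for i in range(a) for j in range(b) for y in data[i][j]
        ),
    }
    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
    ms_error = ss["Error"] / df["Error"]

    table = {s: (ss[s], df[s], (ss[s] / df[s]) / ms_error) for s in ("A", "B", "AxB")}
    table["Error"] = (ss["Error"], df["Error"], None)
    return table


# Made-up 2 x 2 design with two replicates per cell.
data = [
    [[10.0, 12.0], [14.0, 16.0]],   # level A1: cells B1, B2
    [[20.0, 22.0], [18.0, 20.0]],   # level A2: cells B1, B2
]
table = two_way_anova(data)
for source, (ss_value, dof, f_value) in table.items():
    print(source, ss_value, dof, f_value)
```

For this toy dataset the sums of squares are 98 (A), 2 (B), 18 (A×B), and 8 (error), which add up to the total sum of squares of 126; the corresponding F statistics are 49, 1, and 9.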
Two-way ANOVA was discussed in Section 9.8.2.1 in Chapter 9. All the concepts of the two-way ANOVA, its hypotheses, its model and respective calculations can be found in that section. The application of the two-way ANOVA is described in Example 9.13, as well as its solution on SPSS and on Stata. In that example, the fixed factors correspond to the variables Company and Day_of_the_week, and the dependent variable is Time.
The two-way ANOVA can be generalized to three or more factors. According to Maroco (2014), the model then becomes very complex, since the effect of multiple interactions can confound the effect of the factors (Section 9.8.2.2).
The design of experiments technique has often been used to control processes, aiming at identifying the explanatory variables or factors that affect the quality of products and services (dependent or response variable).
Among all the experimental designs, the completely randomized design is the simplest and considers only one explanatory variable with two or more categories. One-way ANOVA has been widely used to analyze data coming from a completely randomized design.
On the other hand, the randomized block design is used more frequently. Finally, when the experiment considers two or more factors, we use the factorial design. Two-way ANOVA has been broadly used to analyze data that comes from a design with two factors.
Table 21.2

| Type of Converter | Type of Iron Ore | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | Hematite | | | Limonite | | | Magnetite | | |
| Bessemer | 161 | 154 | 149 | 145 | 151 | 154 | 168 | 165 | 174 |
| | 157 | 163 | 150 | 141 | 147 | 153 | 163 | 175 | 172 |
| | 161 | 165 | 156 | 139 | 155 | 140 | 181 | 182 | 180 |
| LD | 164 | 169 | 152 | 134 | 144 | 140 | 165 | 164 | 177 |
| | 149 | 155 | 164 | 139 | 142 | 149 | 181 | 183 | 165 |
| | 167 | 159 | 160 | 133 | 129 | 137 | 167 | 178 | 179 |
| Siemens-Martin | 169 | 165 | 152 | 135 | 141 | 148 | 165 | 166 | 183 |
| | 154 | 163 | 167 | 130 | 142 | 129 | 175 | 178 | 179 |
| | 159 | 151 | 165 | 137 | 135 | 141 | 164 | 183 | 179 |