Design and Analysis of Experiments Using R

class: center, middle, inverse, title-slide

.title[
# Design and Analysis of Experiments Using R
]
.subtitle[
## Fractional Factorial Designs
]
.author[
### Olga Lyashevska
]
.date[
### 2022-11-17
]

---

# Recap of Week 5 
<img src="figs/roadmap-crfd.png" style="width: 80%" />

CRFD - Completely Randomised Factorial Design

---
# CRFD

A Completely Randomised Factorial Design is a full factorial experiment: its design consists of two or more factors, each with discrete possible values or "levels", and its experimental units take on all possible combinations of these levels across all factors.

---
# A special case: `\(2^{k}\)` factorial design

A design with `\(k\)` factors, each at only two levels.

These levels may be quantitative, such as two values of temperature, pressure, or time; or they may be qualitative, such as two machines, two operators, the “high” and “low” levels of a factor, or perhaps the presence and absence of a factor.

A complete replicate of such a design requires

`\(2 \times 2 \times \cdots \times 2 = 2^k\)` observations and is called a `\(2^{k}\)` factorial design.
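
As a minimal sketch, the runs of a complete replicate can be enumerated in base R. The choice `\(k = 3\)`, the coded levels -1/+1 and the names `lv` and `full` are assumptions made for illustration only.

```r
# Enumerate all treatment combinations of a 2^k design (here k = 3)
k <- 3
lv <- rep(list(c(-1, 1)), k)        # two coded levels per factor
names(lv) <- LETTERS[seq_len(k)]    # factors A, B, C
full <- expand.grid(lv)             # one row per treatment combination
full
nrow(full) == 2^k                   # TRUE: a complete replicate has 2^k runs
```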

---
# A special case: `\(2^{k}\)` factorial design

A `\(2^{k}\)` factorial design is particularly useful in the early stages of experimental work when many factors are likely to be investigated.

It provides the smallest number of runs with which `\(k\)` factors can be studied in a complete factorial design.

---
# Advantages of factorial design

- More efficient than one-factor-at-a-time experiments.

- Necessary when interactions may be present to avoid misleading conclusions.

- Allow the effects of a factor to be estimated at several levels of the other factors, yielding conclusions that are valid over a range of experimental conditions.

---
# Furthermore

There are 2 benefits to studying several treatment factors simultaneously
in a factorial design.

1. The interaction (joint effects) of the factors can be detected.

2. The experiments are more efficient.

The same precision of effects can be achieved with fewer experiments than would
be required if each of the factors was studied one-at-a-time in separate experiments.

The more factors included in a factorial design, the greater the efficiency and the greater the number of interactions that may be detected.
---
# Disadvantages of factorial design

The more factors included in a factorial experiment, the greater the number of runs that must be performed, even when there are only two levels of each factor and only one replicate.

For even a moderate number of factors, the total number of treatment combinations in a `\(2^k\)` factorial design is large.

For example, a `\(2^5\)` design has 32 treatment combinations, a `\(2^6\)` design has 64 treatment combinations, and so on.

---

With `\(k = 7\)` or more factors, a full factorial design is usually impractical.

---
Consider a two-level, seven-factor factorial design:

`\(2^{7} = 128\)` runs

- 1 overall average
- 1 main effect for each factor
- etc.
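
The breakdown above can be completed with binomial coefficients: a `\(2^7\)` design has `\(\binom{7}{j}\)` terms of order `\(j\)`. The sketch below counts them; the names `ord` and `n_terms` are illustrative.

```r
# Number of model terms of each order in a full 2^7 design
k <- 7
ord <- 0:k                      # 0 = overall average, 1 = main effects, 2 = two-factor interactions, ...
n_terms <- choose(k, ord)       # choose(7, j) terms of order j
names(n_terms) <- ord
n_terms
sum(n_terms) == 2^k             # TRUE: the terms add up to the 128 runs
```

Most of the 128 runs are therefore spent on interactions of order three and higher.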

---
# Screening

In the preliminary stage of experimentation, where the objective may be
to determine which factors are important (screening) from a long list of candidates, a
factorial design may require too many experiments to perform.

This is often not practical or achievable.
---
# If number of factors is large...

...researchers will often abandon the efficiency of factorial experiments altogether.

- They either revert to a one-factor-at-a-time design (CRD).
- Or will run a factorial experiment with a subset of the factors, chosen from the longer list by guessing which ones may be more important.

However, these approaches are less than optimal and do not retain the benefits of factorial experiments with a large number of factors.

---
# A better solution

A better solution to this problem is to use a **fraction** of the experiments, or runs, required for a full factorial experiment.

To be effective, the fraction of runs used must be carefully selected in order to preserve some of the benefits of a full factorial experiment.
---

# This week: Fractional Factorial Design
<img src="figs/roadmap-crff.png" style="width: 80%" />

CRFF - Completely Randomised Fractional Factorial Design

---
# Sparsity-of-effects principle

In preliminary experiments involving a large number of factors, usually only a small proportion of the factors will have significant effects.

This fact has been called the **sparsity-of-effects principle** by Box and Meyer.

Box, G. E. P. and Meyer, R. D. (1986a). An analysis for unreplicated fractional
factorials. Technometrics, 28, 11–18. <a href="../../books/box1986.pdf">Download</a> (right click, then "Save link as")

A system is usually dominated by main effects and low-order interactions.
---
# Assumption

Assuming that certain high-order interactions are negligible, information on the main effects and low-order interactions may be obtained by running only a fraction of the complete factorial experiment.

These fractional factorial designs are among the most widely used types of designs for product and
process design, process improvement, and industrial/business experimentation.

---
# The one-half fraction of the `\(2^k\)` design

Consider a situation in which `\(3\)` factors, each at `\(2\)` levels, are of interest, but the experimenters cannot afford to
run all `\(2^k=2^3 = 8\)` treatment combinations. They can, however, afford `\(4\)` runs.

This suggests a one-half fraction of a `\(2^3\)` design.

Because the design contains `\(\frac{1}{2} \times 2^3 = \frac{2^3}{2^1} = 2^{3-1} = 4\)` treatment combinations, a one-half fraction of the `\(2^3\)` design is often called a `\(2^{k-1}=2^{3-1}\)` design.
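
One way to obtain such a half fraction is sketched below in base R. The generator `C = AB` and the names `basic` and `half` are assumptions for illustration; the slides do not fix a generator for this example.

```r
# Half fraction of a 2^3 design: 4 runs instead of 8
basic <- expand.grid(A = c(-1, 1), B = c(-1, 1))  # full 2^2 design in A and B
half <- transform(basic, C = A * B)               # assumed generator: C = AB
half
nrow(half)                                        # 4 runs
```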

---
# The one-quarter fraction of the `\(2^k\)` design

This design contains `\(2^{k-2}\)` runs and is usually called a `\(2^{k-2}\)` fractional factorial.

For example, with `\(k = 3\)` the design contains `\(\frac{1}{4} \times 2^3 = \frac{2^3}{2^2} = 2^{3-2} = 2\)` treatment combinations.
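
To compare run counts, the following sketch tabulates the full, one-half and one-quarter designs for a few values of `\(k\)`. The range 3 to 7 and the data frame name `runs` are arbitrary choices for illustration.

```r
# Runs needed for full, half and quarter fractions of a 2^k design
k <- 3:7
runs <- data.frame(k = k,
                   full = 2^k,            # complete factorial
                   half = 2^(k - 1),      # one-half fraction
                   quarter = 2^(k - 2))   # one-quarter fraction
runs
```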

---
# The one-half fraction of the `\(2^4\)` design

Steps:

- Start with a full two-level factorial design (all combinations) for three factors, say `\(x_1\)`, `\(x_2\)`, `\(x_3\)`.
- Let the entries in the fourth column be the product of the first three, `\(x_4 = x_{1}x_{2}x_{3}\)`. This relation is called the "generator". For example, when `\(x_1 = x_2 = x_3 = -1\)`, `\(x_4 = (-1)(-1)(-1) = -1\)`.

- The alternative half fraction sets `\(x_4 = -x_{1}x_{2}x_{3}\)`.

The two levels are denoted symbolically
by (−1) and (+1), where (−1) represents the low level and (+1) represents the high level.
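
The steps above can be sketched in base R as follows. The object names `d` and `alt` are illustrative, and the runs are listed in standard order rather than randomised.

```r
# Step 1: full two-level factorial in x1, x2, x3 (8 runs)
d <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1))
# Step 2: fourth column from the generator x4 = x1 * x2 * x3
d$x4 <- d$x1 * d$x2 * d$x3
# Alternative half fraction: x4 = -x1 * x2 * x3
alt <- d
alt$x4 <- -alt$x4
d
```

Together, `d` and `alt` contain all 16 runs of the full `\(2^4\)` design, each exactly once.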
---
# The one-half fraction of the `\(2^4\)` design

<img src="figs/Screenshot_20210126_185916.png" style="width: 80%" />
---
# Confounding (Aliasing)
For 4 factors there is 1 intercept, 4 main effects, and 6 two-factor interactions (11 terms).

- Not all of these terms can be distinguished in 8 runs
- We say that some terms are 'aliased' or confounded.

For example, in the four-factor half fraction with `\(x_4 = x_{1}x_{2}x_{3}\)`, `\(x_{1}x_{2}\)` is aliased with (indistinguishable from) `\(x_{3}x_{4}\)`, as is `\(x_{1}x_{3}\)` with `\(x_{2}x_{4}\)`.
---
# Confounding (Aliasing)
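
A small base R check of why aliased terms cannot be separated, assuming the generator `\(x_4 = x_{1}x_{2}x_{3}\)`: the interaction columns are identical, so a model cannot estimate both. The response `y` is simulated noise and, like the names `d` and `fit`, is only an illustration.

```r
# 2^(4-1) design with generator x4 = x1 * x2 * x3
d <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1))
d$x4 <- with(d, x1 * x2 * x3)

# Interaction columns are elementwise products of the factor columns
identical(with(d, x1 * x2), with(d, x3 * x4))  # TRUE
identical(with(d, x1 * x3), with(d, x2 * x4))  # TRUE

# Consequence: lm() cannot estimate x1:x2 and x3:x4 separately
set.seed(1)
y <- rnorm(8)                                  # simulated response, illustration only
fit <- lm(y ~ x1 * x2 + x3:x4, data = d)
coef(fit)                                      # the x3:x4 coefficient is NA
```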

---
# Create a `\(2^{5-1}\)` factorial design 
A `\(2^{5-1}\)` design has 16 runs. Let's use the generator E = ABCD (specifying the generator is optional).

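```r
# install.packages("FrF2")
library("FrF2")
# 16-run half fraction of a 2^5 design; the fifth factor is generated as E = ABCD
design <- FrF2(nruns = 16, nfactors = 5, 
               generators = "ABCD", 
               randomize = TRUE)  # run order is randomised, so the output varies
# design.info(design)
head(design, n = 4)
```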

```
##    A  B  C  D  E
## 1 -1  1 -1 -1 -1
## 2  1  1 -1  1 -1
## 3  1  1  1  1  1
## 4  1 -1  1  1 -1
```
---
# Alias structure

```r
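# Placeholder response: random numbers, used only so that lm() can be fitted
y <- runif(16, min = 0, max = 1)
# List which effects are aliased among terms up to fourth order
aliases(lm(y ~ (.)^4, data = design))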
```

```
##             
##  A = B:C:D:E
##  B = A:C:D:E
##  C = A:B:D:E
##  D = A:B:C:E
##  E = A:B:C:D
##  A:B = C:D:E
##  A:C = B:D:E
##  A:D = B:C:E
##  A:E = B:C:D
##  B:C = A:D:E
##  B:D = A:C:E
##  B:E = A:C:D
##  C:D = A:B:E
##  C:E = A:B:D
##  D:E = A:B:C
```
---
# Alias structure

In this alias pattern for the `\(2^{5-1}\)` design, main effects
are confounded with four-factor interactions, and two-factor interactions are
confounded with three-factor interactions.

Therefore, if three- and four-factor interactions can be assumed negligible, all main effects and
two-factor interactions can be estimated.
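
The pattern follows from the defining relation: multiplying the generator E = ABCD by E on both sides gives I = ABCDE, and the alias of any effect is its product with ABCDE, where repeated letters cancel. A minimal sketch, in which the helper `alias_of()` is hypothetical and not part of FrF2:

```r
# Alias of an effect under the defining relation I = ABCDE
alias_of <- function(effect, word = "ABCDE") {
  e <- strsplit(effect, "")[[1]]
  w <- strsplit(word, "")[[1]]
  # Letters present in both cancel, because a squared +/-1 column is all +1
  paste(sort(c(setdiff(e, w), setdiff(w, e))), collapse = "")
}
alias_of("A")    # "BCDE"
alias_of("AB")   # "CDE"
```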

---
# Dry Soup Mix example (1)
A manufacturer of packaged dry soup mixes was experiencing excessive variability in
the package weights of a dry soup mix component called the “intermix.”

The intermix is a mixture of ingredients such as vegetable oil, salt, and
so forth. Too much intermix in a soup packet gives it too strong a flavor, and
not enough gives too weak a flavor.

---
# Dry Soup Mix example (2)

It was a two-step process to make the packaged soup mix.

The first step was to make a large batch of soup and dry it on a rotary dryer.

Next, the dried soup batch was placed into a mixer, where the intermix was added through ports as it was mixed.

Then it was packaged in sealed bags of uniform weight. 
---
# Dry Soup Mix example (3)

There were several factors that could be changed in the first step (production of the dried soup batch), and several
factors that could be changed in the second step (adding the intermix and mixing) that could possibly affect the variability of the weight of the intermix in each sealed bag.

A factorial experiment was to be planned to find out which factors affected variability in intermix weights.

---
# Factors and Levels for Soup Mix `\(2^{5-1}\)`

<img src="figs/Screenshot_20210126_204328.png" style="width: 90%" />
---

```r
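# Same 2^(5-1) design, now with named factors and their two levels
soup <- FrF2(16, 5, generators = "ABCD", 
             factor.names = list(Ports=c(1,3), 
                                 Temp=c("Cool","Ambient"), 
                                 MixTime=c(60,80), 
                                 BatchWt=c(1500,2000), 
                                 delay=c(7,1)), 
             randomize = FALSE)  # keep the runs in standard order

# y is still the placeholder response from the previous alias table
aliases(lm( y ~(.)^4, data = soup))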
```

```
##                                    
##  Ports = Temp:MixTime:BatchWt:delay
##  Temp = Ports:MixTime:BatchWt:delay
##  MixTime = Ports:Temp:BatchWt:delay
##  BatchWt = Ports:Temp:MixTime:delay
##  delay = Ports:Temp:MixTime:BatchWt
##  Ports:Temp = MixTime:BatchWt:delay
##  Ports:MixTime = Temp:BatchWt:delay
##  Ports:BatchWt = Temp:MixTime:delay
##  Ports:delay = Temp:MixTime:BatchWt
##  Temp:MixTime = Ports:BatchWt:delay
##  Temp:BatchWt = Ports:MixTime:delay
##  Temp:delay = Ports:MixTime:BatchWt
##  MixTime:BatchWt = Ports:Temp:delay
##  MixTime:delay = Ports:Temp:BatchWt
##  BatchWt:delay = Ports:Temp:MixTime
```
---

```r
# Observed response, one value per run, in the run order of `soup`
library(DoE.base)
y <- c(1.13, 1.25, .97, 1.70, 1.47, 1.28, 1.18, .98, .78, 
       1.36, 1.85, .62, 1.09, 1.10, .76, 2.10 )
soup <- add.response( soup , y )  # attach the response to the design
```
---

```r
mod1 <- lm( y ~ (.)^2, data = soup)
# summary(mod1)
```
A saturated model in the main effects and two-factor interactions is fitted. Since there are no replicates, there are no degrees of freedom left to estimate the error, so a normal probability plot of the regression coefficients needs to be used to aid in judging which effects are significant.
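
A minimal base R sketch of such a plot follows. It rebuilds the design in coded units with generic factor names A to E instead of using the `soup` object, and it assumes the responses are listed in standard order with A changing fastest. That ordering is an assumption about what `randomize = FALSE` produces and should be checked against the actual design.

```r
# Coded 2^(5-1) design in standard order (assumption: A changes fastest)
d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))
d$E <- with(d, A * B * C * D)            # generator E = ABCD
d$y <- c(1.13, 1.25, .97, 1.70, 1.47, 1.28, 1.18, .98, .78,
         1.36, 1.85, .62, 1.09, 1.10, .76, 2.10)

fit <- lm(y ~ (.)^2, data = d)           # saturated: 16 coefficients, 16 runs
eff <- 2 * coef(fit)[-1]                 # effect = 2 x coefficient with -1/+1 coding

# Normal probability plot: effects far from the line are candidates for being active
qq <- qqnorm(eff, main = "Normal plot of effects")
qqline(eff)
text(qq$x, qq$y, labels = names(eff), pos = 4, cex = 0.7)
```

Points that fall close to the reference line behave like random noise, while points that stand clearly apart from it suggest active effects.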
---

# Summary

The `\(2^{k-p}\)` fractional factorial designs are useful in screening experiments to quickly and efficiently identify the subset of factors that are active and to provide some information on interactions.