# A tibble: 2 × 6
term df sumsq meansq statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Type 2 31739. 15869. 1.78 0.179
2 Residuals 51 455249. 8926. NA NA
Randomized complete block design (RCBD)
One treatment factor with \(t\) levels, one blocking factor with \(b\) levels
Factorial design (more on this later)
Response: Firmness (force in N to pierce berry)
Treatment factor: Storage with 3 levels (\(t=3\)): “Air”, “Control”, “Modified Air”
Blocking factor: Variety with 5 levels (\(b=5\)): “Allstar”, “Bounty”, “Kent”, “Selva”, “Vesper”
Berries in the stored treatments were kept at \(0.5^{\circ}\)C for two days
Three clamshells of each variety randomly assigned to treatments (Storage)
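\[y_{ij}=\mu+\tau_i+\rho_j+\varepsilon_{ij}\]
where \(y_{ij}\) is the response for treatment \(i\) in block \(j\), \(\mu\) is the overall mean, \(\tau_i\) is the effect of treatment \(i\) (\(i=1,\dots,t\)), \(\rho_j\) is the effect of block \(j\) (\(j=1,\dots,b\)), and \(\varepsilon_{ij}\) is the random error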
We expect firmness to vary according to variety (blocking factor)
However, we are interested in the effect of the storage method (treatment)
We conduct a hypothesis test for the treatment variable:
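In the notation of the model above, the hypotheses for the treatment factor can be sketched as follows. This is an assumed reconstruction of the intended statement, written for the \(t=3\) Storage levels:

\[H_0:\ \tau_1=\tau_2=\tau_3=0 \qquad \text{vs.} \qquad H_A:\ \text{at least one } \tau_i\neq 0\]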
Ignoring the blocking factor Variety (one-way ANOVA):
# A tibble: 2 × 6
term df sumsq meansq statistic p.value
<chr> <int> <dbl> <dbl> <dbl> <dbl>
1 Storage 2 11.2 5.59 0.995 0.398
2 Residuals 12 67.4 5.62 NA NA
The one-way ANOVA finds no significant difference in mean Firmness between different Storage groups (p-value 0.398)
ANOVA table with the blocking variable (Variety):

| Source of Variation | df | sumsq | meansq | statistic |
|---|---|---|---|---|
| Blocking variable | \(b-1\) | \(SS_{Blocks}\) | \(MS_{Blocks}=\frac{SS_{Blocks}}{(b-1)}\) | \(F=\frac{MS_{Blocks}}{MSE}\) |
| Treatment variable | \(t-1\) | \(SS_{Treatments}\) | \(MS_{Treatments}=\frac{SS_{Treatments}}{(t-1)}\) | \(F=\frac{MS_{Treatments}}{MSE}\) |
| Error | \((t-1)(b-1)\) | \(SSE\) | \(MSE=\frac{SSE}{(t-1)(b-1)}\) | |
| Total | \(N-1\) | \(SST\) | | |
# A tibble: 3 × 6
term df sumsq meansq statistic p.value
<chr> <int> <dbl> <dbl> <dbl> <dbl>
1 Variety 4 63.5 15.9 32.4 0.0000547
2 Storage 2 11.2 5.59 11.4 0.00455
3 Residuals 8 3.93 0.491 NA NA
The total variation (\(SST=78.6\)) is now split among \(SS_{Treatments}=11.2\), \(SS_{Blocks}=63.5\), and \(SSE=3.93\).
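As a check, the degrees of freedom and sums of squares can be worked out by hand. This sketch assumes \(N=tb=15\) observations (one per combination of Variety and Storage) and uses the rounded values printed in the tables, so the totals agree only up to rounding:

\[df_{Blocks}=b-1=4,\quad df_{Treatments}=t-1=2,\quad df_{Error}=(t-1)(b-1)=8,\quad df_{Total}=N-1=14\]

\[SST=SS_{Blocks}+SS_{Treatments}+SSE\approx 63.5+11.2+3.93=78.63\]

The residual sum of squares of the one-way ANOVA (67.4) is approximately \(SS_{Blocks}+SSE=63.5+3.93=67.43\). Blocking moves the Variety variation out of the error term, which is why the MSE drops from 5.62 to 0.491 and the F-statistic for Storage rises from 0.995 to 11.4.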
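Note that each F-statistic is still calculated as MSG/MSE, with the mean square of the term in question (\(MS_{Blocks}\) or \(MS_{Treatments}\)) in place of MSG
For example:
The F-statistic for Variety is \(15.9/0.491=32.4\)
The F-statistic for Storage is \(5.59/0.491=11.4\)
Tukey multiple comparisons of means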
95% family-wise confidence level
Fit: aov(formula = Firmness ~ Variety + Storage, data = strawberries)
$Storage
diff lwr upr p adj
Control-Air 0.10 -1.1659048 1.365905 0.9723992
ModifiedAir-Air 1.88 0.6140952 3.145905 0.0070490
ModifiedAir-Control 1.78 0.5140952 3.045905 0.0095586
There is a significant difference between the mean firmness for “Modified Air” and “Air” (adjusted p-value 0.007) and between “Modified Air” and “Control” (adjusted p-value 0.010), but not between “Control” and “Air” (adjusted p-value 0.972).
Because this was a randomized experiment, we can conclude that the difference in storage method caused the difference in firmness.
Storing strawberries in air enriched in CO2 increases firmness compared to not storing the berries at all or storing them in normal air.
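The Tukey intervals in the output above can be approximately reproduced from the ANOVA table alone. The sketch below assumes a balanced design (5 observations per Storage level, one per Variety) and uses the rounded MSE from the table, so it matches the printed bounds only to about two decimals. The variable names are illustrative.

```r
# Values read from the RCBD ANOVA table above (rounded, so results are approximate)
mse      <- 0.491  # residual mean square
df_error <- 8      # residual degrees of freedom
t_levels <- 3      # number of Storage levels
b_blocks <- 5      # observations per Storage level (assumed: one per Variety)

# Critical value of the studentized range distribution (95% family-wise level)
q_crit <- qtukey(0.95, nmeans = t_levels, df = df_error)

# Half-width shared by all pairwise intervals in a balanced design
half_width <- q_crit * sqrt(mse / b_blocks)

# Interval for ModifiedAir - Air, using the difference reported in the output
diff_ma_air <- 1.88
ci <- c(lwr = diff_ma_air - half_width, upr = diff_ma_air + half_width)

print(round(half_width, 2))
print(round(ci, 2))
```

The half-width comes out close to 1.27, in line with the reported interval of 0.61 to 3.15 for ModifiedAir-Air. An interval that excludes zero corresponds to an adjusted p-value below 0.05.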
We want to assess whether there is a difference in the impact that the predatory larvae of three damselfly species (Enallagma, Lestes and Pyrrhosoma) have on the abundance of midge larvae in a pond.
We plan to conduct an experiment in which small (1 \(m^2\)) nylon mesh cages are set up in the pond.
All damselfly larvae will be removed from the cages and each cage will then be stocked with 20 individuals of one of the species.
After 3 weeks, we will sample the cages and count the density of midge larvae in each.
We have 12 cages altogether, so four replicates of each of the three species can be established.
If the cages are distributed completely at random (a completely randomized design, CRD), they will cover a wide range of variation in these various factors.
These sources of variation will almost certainly cause the density of midge larvae to vary around the pond in an unpredictable way, increasing the noise in the data.
If we instead group the cages into clusters, each containing one cage per species, we are creating “spatial blocks”.
There may be considerable differences between blocks, but these won’t obscure differences between the treatments because all three treatments are present in every block.
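One way to set up such a layout is to randomize the three species to cages separately within each block. The sketch below is illustrative only: the block labels A to D, the cage numbers, and the seed are assumptions, not part of the original experiment description.

```r
# Sketch: randomize the three species to cages separately within each block.
# Block labels A-D and cage numbers 1-3 are illustrative assumptions.
set.seed(2024)  # arbitrary seed so the layout can be reproduced

species <- c("Enallagma", "Lestes", "Pyrrhosoma")
blocks  <- c("A", "B", "C", "D")

# For each block, shuffle the species and assign them to that block's cages
layout <- do.call(rbind, lapply(blocks, function(b) {
  data.frame(Block = b, Cage = seq_along(species), Species = sample(species))
}))

print(layout)

# Every species should appear exactly once in every block
print(table(layout$Block, layout$Species))
```

Randomizing within blocks, rather than across all 12 cages at once, is what guarantees that each block contains all three treatments.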
damsels
Rows: 12
Columns: 3
$ Midge <dbl> 304, 464, 320, 578, 509, 458, 680, 740, 630, 356, 390, 350
$ Species <chr> "Enallagma", "Lestes", "Pyrrhosoma", "Enallagma", "Lestes", "P…
$ Block <chr> "A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"
Midge is the density of midge larvae (per \(m^2\)) in each enclosure, after running the experiment for 3 weeks
Species contains the species of damselfly (levels: Enallagma, Lestes and Pyrrhosoma)
Block contains location identities (A, B, C, D)
Midge density is expected to vary according to location (blocking factor)
However, we are interested in the effect of the Species of damselfly (treatment)
We conduct a hypothesis test for the treatment variable:



# A tibble: 3 × 6
term df sumsq meansq statistic p.value
<chr> <int> <dbl> <dbl> <dbl> <dbl>
1 Block 3 208425. 69475. 28.0 0.000631
2 Species 2 14904. 7452. 3.01 0.125
3 Residuals 6 14878. 2480. NA NA
[1] 0.8854944
Here, there is a significant effect of block (p-value 0.000631), which says that the density of midge larvae varies across the pond.
It looks like blocking was a good idea - there is a lot of spatial (nuisance) variation in midge larvae density.
Of course, what we actually care about is the damselfly species effect. This main effect is not significant (p-value 0.125), so we fail to reject the null hypothesis: it is plausible that there is no difference in the impact of the predatory larvae of the three damselfly species.
While our conclusion is the same as from the one-way ANOVA, the blocked analysis better accounts for the variation due to location.
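The damselfly analysis can be sketched in base R as follows. The Midge and Block values are copied from the data summary above. The Species column is truncated there, so the code assumes that the species repeat in the order Enallagma, Lestes, Pyrrhosoma within every block. The object names (`rcbd_fit`, `crd_fit`, `adj_r2`) are illustrative.

```r
# Midge and Block values are copied from the data summary above.
# ASSUMPTION: Species is truncated in that summary; it is assumed here to repeat
# in the order Enallagma, Lestes, Pyrrhosoma within each block.
damsels <- data.frame(
  Midge   = c(304, 464, 320, 578, 509, 458, 680, 740, 630, 356, 390, 350),
  Species = rep(c("Enallagma", "Lestes", "Pyrrhosoma"), times = 4),
  Block   = rep(c("A", "B", "C", "D"), each = 3)
)

# RCBD: blocking factor first, then the treatment factor
rcbd_fit <- aov(Midge ~ Block + Species, data = damsels)
rcbd_tab <- summary(rcbd_fit)[[1]]
print(rcbd_tab)

# One-way ANOVA that ignores the blocks, for comparison
crd_fit <- aov(Midge ~ Species, data = damsels)
crd_tab <- summary(crd_fit)[[1]]
print(crd_tab)

# Adjusted R-squared of the block + species model
adj_r2 <- summary(lm(Midge ~ Block + Species, data = damsels))$adj.r.squared
print(adj_r2)
```

Under this assumption the block + species fit reproduces the sums of squares in the table above (about 208425, 14904 and 14878), and the adjusted R-squared agrees with the unlabeled value 0.8854944 printed after the table. In the one-way fit the block variation stays in the residuals, so the F-statistic for Species falls to about 0.30.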