Summary of "Two-Way ANOVA - Full Course"

Two-way ANOVA — Summary

Main ideas and concepts

Two-way ANOVA (analysis of variance) tests how two categorical independent variables (factors) affect a single continuous dependent variable, and whether the factors interact.
It extends one-way ANOVA (one factor) to handle two factors simultaneously. If you only have one factor, use one-way ANOVA; if that single factor has two levels you could alternatively use an independent-samples t-test.
A two-way ANOVA answers three questions:
1. Is there a main effect of factor A on the dependent variable?
2. Is there a main effect of factor B on the dependent variable?
3. Is there an interaction effect between A and B (i.e., does the effect of one factor depend on the level of the other)?
ANOVA partitions the total variability of the dependent variable into components explained by factor A, factor B, the A × B interaction, and unexplained (error) variance.

Hypotheses

There are three sets of hypotheses to test:

For factor A:
- H0A: The mean of the dependent variable is the same at every level of factor A (no main effect of A).
- H1A: At least one level of A has a different mean (a main effect of A exists).
For factor B:
- H0B: The mean of the dependent variable is the same at every level of factor B (no main effect of B).
- H1B: At least one level of B has a different mean (a main effect of B exists).
For interaction:
- H0AB: No interaction: the effect of one factor is the same at every level of the other.
- H1AB: Interaction present: the effect of one factor depends on the level of the other.
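
The three null hypotheses can also be written against the usual effects model. This is a sketch in conventional textbook notation, which is an assumption here: the symbols μ, α_i, β_j, (αβ)_ij and ε_ijk do not come from the course, and P, Q and n are the numbers of levels and the per-cell sample size used in the procedure further down.

```latex
\begin{align*}
Y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
  \quad i = 1,\dots,P,\; j = 1,\dots,Q,\; k = 1,\dots,n \\
H_{0A} &: \alpha_1 = \alpha_2 = \dots = \alpha_P = 0 \\
H_{0B} &: \beta_1 = \beta_2 = \dots = \beta_Q = 0 \\
H_{0AB} &: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
\end{align*}
```

Each alternative hypothesis states that at least one of the corresponding terms is nonzero.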

Assumptions to check

Normality: Data within groups (or residuals) should be approximately normally distributed — check with a Q–Q plot.
Homogeneity of variances: Group variances should be equal — check with Levene’s test.
Independence: Observations must be independent (one observation shouldn’t influence another).
Measurement scale: Dependent variable should be measured on an interval/ratio (metric) scale.

Step-by-step procedure to perform a two-way ANOVA

Define factors and levels
- Identify factor A and factor B and their levels (e.g., drug type: A/B; gender: male/female).
- Determine the sample size per cell (n) and the number of levels P (factor A) and Q (factor B). With equal cell sizes (a balanced design), total N = n × P × Q.
Check assumptions
- Plot residuals on a Q–Q plot (normality).
- Run Levene’s test for homogeneity of variances.
- Ensure study design supports independence.
- Confirm dependent variable is metric.
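
As a rough illustration of the homogeneity check, the sketch below computes the mean-centred Levene statistic by hand. The scores are hypothetical (the summary does not list raw data), and the critical value 3.24 is the tabulated F(0.05; 3, 16), rounded. Statistics packages may centre on the median instead, so their output can differ.

```python
from statistics import mean

# HYPOTHETICAL data: made-up scores for a 2 x 2 design with n = 5 per cell.
cells = {
    ("drug A", "male"): [6, 4, 4, 8, 10],
    ("drug A", "female"): [5, 7, 3, 6, 6],
    ("drug B", "male"): [4, 6, 2, 5, 7],
    ("drug B", "female"): [8, 2, 7, 3, 5],
}


def levene_w(groups):
    """Mean-centred Levene statistic for a list of groups."""
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(z)
    n_total = sum(len(zi) for zi in z)
    z_bar = mean([v for zi in z for v in zi])
    between = sum(len(zi) * (mean(zi) - z_bar) ** 2 for zi in z)
    within = sum((v - mean(zi)) ** 2 for zi in z for v in zi)
    return (n_total - k) / (k - 1) * between / within


w = levene_w(list(cells.values()))
F_CRIT_3_16 = 3.24  # tabulated F(0.05; 3, 16), rounded

print(f"Levene W = {w:.3f}")
print("no evidence of unequal variances" if w < F_CRIT_3_16 else "variances differ")
```

A statistic below the critical value means the equal-variance assumption is not rejected at the 5% level.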
Compute means
- Group means: mean for each cell (combination of A and B).
- Marginal means: mean for each level of A (averaged across B) and for each level of B (averaged across A).
- Grand mean: mean across all observations.
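
A minimal sketch of the three kinds of means, using hypothetical scores for the drug type by gender design. The values are made up; they were only picked so that the grand mean matches the 5.4 reported in the worked example.

```python
from statistics import mean

# HYPOTHETICAL data, keyed by (level of factor A, level of factor B).
cells = {
    ("drug A", "male"): [6, 4, 4, 8, 10],
    ("drug A", "female"): [5, 7, 3, 6, 6],
    ("drug B", "male"): [4, 6, 2, 5, 7],
    ("drug B", "female"): [8, 2, 7, 3, 5],
}

# Group means: one mean per cell (combination of A and B).
cell_means = {key: mean(values) for key, values in cells.items()}


def marginal_means(position):
    """Mean per level of one factor, pooling over the other factor.

    position 0 selects factor A, position 1 selects factor B.
    """
    pooled = {}
    for key, values in cells.items():
        pooled.setdefault(key[position], []).extend(values)
    return {level: mean(values) for level, values in pooled.items()}


means_a = marginal_means(0)
means_b = marginal_means(1)

# Grand mean: mean across all observations.
grand_mean = mean([x for values in cells.values() for x in values])

print("cell means:", cell_means)
print("marginal means of A:", means_a)
print("marginal means of B:", means_b)
print("grand mean:", grand_mean)
```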
Compute sums of squares (SS)
- SS_total = sum over all observations of (X_ijk − grand mean)^2, where X_ijk is observation k in the cell at level i of A and level j of B.
- SS_between_groups = sum over groups (cells) of n × (group mean − grand mean)^2. (This equals SS_A + SS_B + SS_AB.)
- SS_A = Q × n × sum over i of (mean_i. − grand mean)^2, where mean_i. is the marginal mean of level i of A (variation due to factor A).
- SS_B = P × n × sum over j of (mean_.j − grand mean)^2, where mean_.j is the marginal mean of level j of B (variation due to factor B).
- SS_AB = SS_between_groups − SS_A − SS_B (interaction).
- SS_error (residual) = sum over groups of the sum over observations in each group of (X_ijk − group mean)^2.
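
The sums of squares above can be sketched in a few lines for a balanced design. The scores are hypothetical: they are not the course's raw data (which the summary does not list) and were chosen only so that the results reproduce the sums of squares reported in the worked example.

```python
from statistics import mean

# HYPOTHETICAL data, keyed by (level of factor A, level of factor B).
cells = {
    ("drug A", "male"): [6, 4, 4, 8, 10],
    ("drug A", "female"): [5, 7, 3, 6, 6],
    ("drug B", "male"): [4, 6, 2, 5, 7],
    ("drug B", "female"): [8, 2, 7, 3, 5],
}
levels_a = sorted({a for a, _ in cells})
levels_b = sorted({b for _, b in cells})
P, Q = len(levels_a), len(levels_b)
n = 5  # balanced design: same number of observations in every cell

all_obs = [x for values in cells.values() for x in values]
grand = mean(all_obs)


def level_mean(position, level):
    """Marginal mean: pool all observations whose key has `level` at `position`."""
    return mean([x for key, v in cells.items() if key[position] == level for x in v])


ss_total = sum((x - grand) ** 2 for x in all_obs)
ss_between = sum(n * (mean(v) - grand) ** 2 for v in cells.values())
ss_a = Q * n * sum((level_mean(0, a) - grand) ** 2 for a in levels_a)
ss_b = P * n * sum((level_mean(1, b) - grand) ** 2 for b in levels_b)
ss_ab = ss_between - ss_a - ss_b
ss_error = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

for name, value in [("total", ss_total), ("between", ss_between), ("A", ss_a),
                    ("B", ss_b), ("AB", ss_ab), ("error", ss_error)]:
    print(f"SS_{name} = {value:.1f}")
```

The closed-form expressions for SS_A and SS_B rely on equal cell sizes; unbalanced designs need a different approach.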
Compute degrees of freedom (df)
- df_total = N − 1.
- df_A = P − 1.
- df_B = Q − 1.
- df_AB = (P − 1)(Q − 1).
- df_error = P × Q × (n − 1) = N − P × Q.
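
For the design used in the worked example (2 drug types, 2 genders, 5 observations per cell), the degrees of freedom work out as follows. The variable names are illustrative only.

```python
P, Q, n = 2, 2, 5  # levels of A, levels of B, observations per cell
N = n * P * Q

df_total = N - 1
df_a = P - 1
df_b = Q - 1
df_ab = (P - 1) * (Q - 1)
df_error = P * Q * (n - 1)

# The component df must add up to the total df.
assert df_a + df_b + df_ab + df_error == df_total

print(df_total, df_a, df_b, df_ab, df_error)  # prints: 19 1 1 1 16
```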
Compute mean squares (MS)
- MS_factor = SS_factor / df_factor for A, B, and AB.
- MS_error = SS_error / df_error.
Compute F-statistics
- F_A = MS_A / MS_error
- F_B = MS_B / MS_error
- F_AB = MS_AB / MS_error
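
A short sketch of the mean-square and F computations, fed with the sums of squares and degrees of freedom reported in the worked example. The dictionary layout is just one convenient way to organise the terms.

```python
# Values reported in the worked example of this summary.
ss = {"A": 5.0, "B": 0.8, "AB": 1.8, "error": 77.2}
df = {"A": 1, "B": 1, "AB": 1, "error": 16}

# Mean square = sum of squares divided by its degrees of freedom.
ms = {term: ss[term] / df[term] for term in ss}

# Each effect is tested against the same error mean square.
f_stats = {term: ms[term] / ms["error"] for term in ("A", "B", "AB")}

print(f"MS_error = {ms['error']:.3f}")
for term, f_value in f_stats.items():
    print(f"F_{term} = {f_value:.2f}")
```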
Obtain p-values
- Use the F-distribution with the appropriate df to get p-values (from tables or software).
Decision rule
- For each test (A, B, AB), if p < α (commonly .05), reject H0 for that effect; otherwise do not reject H0.
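
To show where the p-values come from, here is a standard-library sketch of the upper tail of the F-distribution, followed by the decision rule. It relies on the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 × f), where I_x is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name `f_sf` and the step count are assumptions made for this illustration; in practice a library routine such as SciPy's `scipy.stats.f.sf` would normally be used.

```python
import math


def f_sf(f_value, df1, df2, steps=2000):
    """Upper-tail probability P(F > f_value) for an F(df1, df2) distribution.

    Sketch only: assumes f_value > 0, df2 >= 2, and an even `steps`.
    """
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule for the incomplete beta integral on [0, x].
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    incomplete_beta = total * h / 3

    # Normalise by the complete beta function B(a, b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return incomplete_beta / math.exp(log_beta)


ALPHA = 0.05
MS_ERROR = 77.2 / 16  # from the worked example
f_stats = {"A": 5.0 / MS_ERROR, "B": 0.8 / MS_ERROR, "AB": 1.8 / MS_ERROR}
p_values = {effect: f_sf(f, 1, 16) for effect, f in f_stats.items()}

for effect, p in p_values.items():
    decision = "reject H0" if p < ALPHA else "do not reject H0"
    print(f"{effect}: F = {f_stats[effect]:.2f}, p = {p:.3f} -> {decision}")
```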
Post-hoc or follow-up
- If a significant main effect belongs to a factor with more than two levels, perform post-hoc multiple comparisons to find which levels differ.
- If interaction is significant, interpret the interaction first (it can change the meaning of main effects) and examine simple effects as needed.

Worked example

Design: two factors — drug type (A, B) and gender (male, female). n = 5 per cell → N = 20.

Reported results:

Grand mean = 5.4
SS_total = 84.8; df_total = 19 → MS_total = 84.8 / 19 ≈ 4.46
SS_between_groups = 7.6; df_between = 3 → MS_between = 7.6 / 3 ≈ 2.53
SS_A = 5.0; df_A = 1 → MS_A = 5.0
SS_B = 0.8; df_B = 1 → MS_B = 0.8
SS_AB = 1.8; df_AB = 1 → MS_AB = 1.8
SS_error = 77.2; df_error = 16 → MS_error = 77.2 / 16 ≈ 4.83
F values:
- F_A = 5.0 / 4.83 ≈ 1.04
- F_B = 0.8 / 4.83 ≈ 0.17
- F_AB = 1.8 / 4.83 ≈ 0.37

Interpretation: each F value is well below the critical value F(1, 16) ≈ 4.49 at α = .05, so all p-values > .05 and none of the three null hypotheses is rejected. There are no significant main effects of drug type or gender, and no significant interaction.
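
A quick way to sanity-check a reported ANOVA table is to confirm that the parts add up. The sketch below does this for the figures listed above; the key names are illustrative.

```python
import math

# Sums of squares as reported in the worked example.
reported = {
    "total": 84.8,
    "between": 7.6,
    "A": 5.0,
    "B": 0.8,
    "AB": 1.8,
    "error": 77.2,
}

# Between-groups variation splits into the two main effects and the interaction.
between_ok = math.isclose(
    reported["A"] + reported["B"] + reported["AB"], reported["between"]
)

# Total variation splits into between-groups and error variation.
total_ok = math.isclose(reported["between"] + reported["error"], reported["total"])

print("SS_A + SS_B + SS_AB == SS_between:", between_ok)
print("SS_between + SS_error == SS_total:", total_ok)
```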

How to run it in software

Example tool: DataTab (datatab.net)
- Paste data into a table.
- Navigate to hypothesis tests → select variables.
- The tool returns a two-way ANOVA table, Levene’s test, descriptive statistics, assumption checks, and a plain-language summary.
You can also compute critical F values from F-distribution tables for manual comparisons, but most analysts use software to get p-values.
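
The manual comparison can be sketched as follows for the worked example. The critical value 4.49 is the tabulated F(0.05; 1, 16), rounded to two decimals; all three effects share it here because each has 1 and 16 degrees of freedom. The labels are illustrative.

```python
F_CRIT_1_16 = 4.49  # tabulated F(0.05; 1, 16), rounded

# Rounded F statistics from the worked example.
f_stats = {"drug type (A)": 1.04, "gender (B)": 0.17, "interaction (AB)": 0.37}

decisions = {
    name: "reject H0" if f_value > F_CRIT_1_16 else "do not reject H0"
    for name, f_value in f_stats.items()
}

for name, decision in decisions.items():
    print(f"{name}: F = {f_stats[name]:.2f} vs {F_CRIT_1_16} -> {decision}")
```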

Why “variance” matters

ANOVA decomposes the total sum of squares (total variability) into components attributable to each factor, their interaction, and unexplained (error) variance. This lets you see how much variability each term explains and test whether those portions are large relative to residual variance.
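
As an illustration of "how much variability each term explains", the sketch below expresses each sum of squares from the worked example as a share of the total. This ratio is commonly reported as eta squared, a term the summary itself does not use.

```python
# Sums of squares from the worked example.
ss = {"A": 5.0, "B": 0.8, "AB": 1.8, "error": 77.2}
ss_total = sum(ss.values())

# Share of the total variability attributed to each term.
share = {term: value / ss_total for term, value in ss.items()}

for term, fraction in share.items():
    print(f"{term}: {fraction:.1%} of total variability")
```

In this example the error term accounts for roughly nine tenths of the total, which is consistent with none of the effects reaching significance.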

Speakers / sources

Narrator / Instructor presenting the lesson.
DataTab (online statistics tool referenced; appears as “data tab” / data.net).
F-distribution / critical F-value tables (referenced as a resource for manually comparing F statistics against critical values).