Analyzing 3 or more paired scale variables
Omnibus Test (UNDER CONSTRUCTION)
In the previous section we noticed that the mean score differs per genre, but those are differences in our sample. We need a statistical test to find out whether there are also differences in the population, i.e. whether the differences are significant.
Unfortunately there is not just one test that could be used, and which one to use is tricky and not clearly agreed upon yet. The options that I've come across are a multivariate analysis of variance (MANOVA), an uncorrected repeated measures ANOVA, a repeated measures ANOVA with a Greenhouse-Geisser correction (1958; 1959), or one with a Huynh-Feldt correction (1976). The choice between some of these depends on something known as sphericity, for which Mauchly's test can be used.
To test if there are differences in means among the scale variables, we first need to test for so-called sphericity and then, depending on the result, perform a repeated measures ANOVA.
Sphericity has to do with the variances of the differences between the variables. A nice explanation can be found in the YouTube video from How2Stats. We need to test for sphericity (usually done with Mauchly's test for sphericity) because if sphericity cannot be assumed, we need to adjust the result of the repeated measures ANOVA. This is where things become tricky, since there seems to be no simple answer on how to adjust. Four options are often mentioned: the Greenhouse-Geisser correction, the Huynh-Feldt correction, the average of those two, and using a MANOVA. A fifth option is the 'lower-bound' correction, but I have never seen it considered the best choice.
Box (1954a, 1954b) started the idea for the corrections but used a population covariance matrix. Greenhouse-Geisser (1958; 1959) expanded on this using the estimated population matrix. Kieffer (2002) recommends that if you have to choose between Greenhouse-Geisser and Huynh-Feldt, to go for Greenhouse-Geisser since it is more conservative: “if one must be chosen over the other, however, it is always somewhat safer to utilize the Greenhouse-Geisser ε as it produces a more conservative correction factor” (p. 11). However, Girden (1992 as cited in Field, 2009, p. 461) recommends the Huynh-Feldt correction if the Greenhouse-Geisser epsilon is above .75, and the Greenhouse-Geisser correction otherwise. Others like Stevens (2009 p. 422? as cited in Field, 2009) recommend to simply average the two.
Instead of a repeated measures ANOVA, a MANOVA could also be used. Maxwell and Delaney (1990, p. 602 as cited in Stevens, 2009, p. 427) recommend not to use this approach if n < k + 10, where n is the sample size and k the number of measures. Baguley (2004) adds (besides the n < k + 10 criterion) that the MANOVA should only be used if epsilon is below 0.7.
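The epsilon estimates behind the Greenhouse-Geisser and Huynh-Feldt corrections mentioned above can be computed from the sample covariance matrix of the k measures. The following is a minimal sketch in plain Python, assuming a single group of respondents (no between-subjects factor) and the commonly published formulas; the function names and the example matrices are made up for illustration and are not taken from the example data.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k sample covariance matrix.

    cov is a list of k lists of k numbers (hypothetical input format).
    """
    k = len(cov)
    mean_diag = sum(cov[i][i] for i in range(k)) / k       # mean of the variances
    grand_mean = sum(sum(row) for row in cov) / (k * k)    # mean of all entries
    row_means = [sum(row) / k for row in cov]
    sum_sq = sum(v * v for row in cov for v in row)        # sum of squared entries
    numerator = (k * (mean_diag - grand_mean)) ** 2
    denominator = (k - 1) * (
        sum_sq - 2 * k * sum(m * m for m in row_means) + (k * grand_mean) ** 2
    )
    return numerator / denominator


def huynh_feldt_epsilon(gg_eps, n, k):
    """Huynh-Feldt epsilon for one group of n respondents, capped at 1."""
    hf = (n * (k - 1) * gg_eps - 2) / ((k - 1) * (n - 1 - (k - 1) * gg_eps))
    return min(hf, 1.0)


def lower_bound_epsilon(k):
    """The most conservative possible epsilon for k measures."""
    return 1 / (k - 1)


# Equal variances and equal covariances: sphericity holds, epsilon is 1 (up to rounding).
spherical = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
print(greenhouse_geisser_epsilon(spherical))

# One measure with a much larger variance: epsilon drops below 1.
unequal = [[1, 0, 0], [0, 1, 0], [0, 0, 10]]
print(greenhouse_geisser_epsilon(unequal))
```

Epsilon always lies between the lower bound 1/(k - 1) and 1; the further it is below 1, the stronger the violation of sphericity and the larger the correction.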
For the MANOVA a few different approaches exist, but they only differ if you are also going to split the data based on another variable (e.g. by gender). I usually see Wilks' Lambda being used, so go for that one.
Here's what I would do (but I'm definitely no expert on this). Check if the Greenhouse-Geisser epsilon is below 0.7 and n ≥ k + 10; if both hold, use a MANOVA. If not, use Mauchly's test for sphericity. If its significance is above .05, use the repeated measures ANOVA without corrections. If it is below .05, check the Greenhouse-Geisser epsilon: if it is above 0.75 use the Huynh-Feldt correction, otherwise use the Greenhouse-Geisser correction. Figure 1 visualises this decision flow.
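The decision flow described above can also be written down as a small function. This is only a sketch of the scheme in this section, not an established rule; the function name, the argument names and the returned labels are made up for illustration.

```python
def choose_omnibus_test(gg_epsilon, n, k, mauchly_p, alpha=0.05):
    """Pick a test following the decision flow described in the text.

    gg_epsilon: Greenhouse-Geisser epsilon
    n:          sample size
    k:          number of paired scale variables (measures)
    mauchly_p:  significance (p-value) of Mauchly's test for sphericity
    """
    # Step 1: strong violation of sphericity and a large enough sample.
    if gg_epsilon < 0.7 and n >= k + 10:
        return "MANOVA (Wilks' Lambda)"
    # Step 2: sphericity can be assumed, so no correction is needed.
    if mauchly_p > alpha:
        return "repeated measures ANOVA, uncorrected"
    # Step 3: sphericity violated, pick the correction based on epsilon.
    if gg_epsilon > 0.75:
        return "repeated measures ANOVA, Huynh-Feldt correction"
    return "repeated measures ANOVA, Greenhouse-Geisser correction"


# Hypothetical input values, chosen only to show each branch.
print(choose_omnibus_test(gg_epsilon=0.60, n=40, k=4, mauchly_p=0.01))
print(choose_omnibus_test(gg_epsilon=0.95, n=40, k=4, mauchly_p=0.30))
print(choose_omnibus_test(gg_epsilon=0.80, n=40, k=4, mauchly_p=0.01))
print(choose_omnibus_test(gg_epsilon=0.60, n=12, k=4, mauchly_p=0.01))
```

Note that the Greenhouse-Geisser branch is only reached when epsilon is between 0.7 and 0.75, or when epsilon is below 0.7 but the sample is too small for the MANOVA.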
The argumentation for this scheme, in brief:
Maxwell and Delaney (1990, p. 602 as cited in Stevens, 2009, p. 427) recommend not to use the MANOVA if n < k + 10. Turning this around gives one of the two conditions in my scheme: only use a MANOVA if n ≥ k + 10.
Baguley (2004) adds to this condition that the MANOVA should only be used if epsilon is below 0.7.
Kieffer (2002) recommends that if you have to choose between Greenhouse-Geisser and Huynh-Feldt, to go for Greenhouse-Geisser since it is more conservative: “if one must be chosen over the other, however, it is always somewhat safer to utilize the Greenhouse-Geisser ε as it produces a more conservative correction factor” (p. 11). However, Girden (1992 as cited in Field, 2009, p. 461) recommends the Huynh-Feldt correction if the Greenhouse-Geisser epsilon is above .75, and the Greenhouse-Geisser correction otherwise. Others like Stevens (2009 p. 422? as cited in Field, 2009) recommend to simply average the two.
In the example the Greenhouse-Geisser epsilon is 0.574, the sample size is 150, and there are 6 measures. The MANOVA with Wilks' Lambda is therefore used, since 0.574 < 0.7 and 150 ≥ 6 + 10.
A MANOVA indicated that there was a statistically significant difference in average score for the different genres, F(5, 145) = 79.72, p < .001; Wilks' Λ = 0.267.
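As a sanity check on the values in this report: with a single group of respondents (no between-subjects factor), Wilks' Λ can be converted to an exact F-value with k - 1 and n - k + 1 degrees of freedom. The sketch below assumes that setting; the function name is made up for illustration.

```python
def wilks_lambda_to_f(wilks_lambda, n, k):
    """Convert Wilks' Lambda to F for a one-group repeated measures MANOVA.

    n is the sample size and k the number of measures.
    Returns (F, df1, df2).
    """
    df1 = k - 1        # number of independent differences between measures
    df2 = n - k + 1
    f_value = (1 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_value, df1, df2


f_value, df1, df2 = wilks_lambda_to_f(0.267, n=150, k=6)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With n = 150 and k = 6 this gives the degrees of freedom 5 and 145 from the report. The F-value comes out slightly below the reported 79.72, because Λ is entered here rounded to three decimals.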
Using the repeated measures ANOVA instead, the report could look like:
Mauchly’s test indicated that the assumption of sphericity had been violated (χ²(14) = 235.3, p < .001), therefore degrees of freedom for the repeated measures ANOVA were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.574). The results showed there was a statistically significant difference in average score for the different genres, F(2.872, 427.940) = 374.65, p < .001.
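The degrees of freedom in this report follow from the number of measures, the sample size and epsilon. A minimal sketch, assuming a single group of respondents; the function names are made up for illustration.

```python
def mauchly_df(k):
    """Degrees of freedom of the chi-square approximation in Mauchly's test."""
    p = k - 1
    return p * (p + 1) // 2 - 1


def corrected_df(epsilon, n, k):
    """Epsilon-corrected degrees of freedom for the repeated measures ANOVA.

    The uncorrected values are k - 1 and (k - 1)(n - 1); both are
    multiplied by epsilon (Greenhouse-Geisser or Huynh-Feldt).
    """
    df1 = epsilon * (k - 1)
    df2 = epsilon * (k - 1) * (n - 1)
    return df1, df2


print(mauchly_df(6))                      # 14, as in the report
df1, df2 = corrected_df(0.574, n=150, k=6)
print(round(df1, 3), round(df2, 3))
```

Because epsilon is entered here rounded to three decimals, the corrected degrees of freedom differ slightly from the reported 2.872 and 427.940, which were presumably computed with the unrounded epsilon.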
Now that we know there are significant differences between the mean scores, it is of course also good to know which ones are different. This requires a so-called post-hoc test, which is the topic for the next section.