# Analysing a nominal and scale variable

## Part 3a: Test for mean differences

From the sample data we would like to know if the differences in means might also appear in the population. In the example used so far we saw that there were differences in averages between the three locations, but this could simply be due to sampling error. The test to see if these differences might also occur in the population is the one-way ANOVA.

If the test results in a significance (p-value) of less than the pre-determined significance level (usually .05), the nominal variable has an effect on the scale variable and most likely the means for one or more categories will be different in the population from one or more other categories.

In the example the significance is .001, which is the probability of obtaining a sample with an F value of 8.043 or even higher if there were no differences between the three groups in the population. Since this chance is so low (below .050), we can conclude that most likely the location has an influence in the population on the grade students gave.

In formal APA style we can report this result as:

The one-way ANOVA showed that Location had a significant effect on how students evaluated the course, *F*(2, 45) = 8.04, *p* = .001.
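As a minimal sketch of how such a test can be run in Python, using scipy's `f_oneway` (the grades below are made up for illustration, not the actual course-evaluation data):

```python
# One-way ANOVA with scipy, using made-up grades for three locations.
from scipy.stats import f_oneway

location_a = [6, 7, 5, 8, 6]
location_b = [4, 5, 3, 5, 4]
location_c = [7, 8, 9, 7, 8]

f_val, p_val = f_oneway(location_a, location_b, location_c)
print(f"F = {f_val:.2f}, p = {p_val:.4f}")
```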

**Click here to see how to perform a one-way ANOVA with SPSS, R (Studio), Excel, Python, or manually.**

**with SPSS**

Two ways to perform a one-way ANOVA with SPSS:

**with R (Studio)**

Click on the thumbnail below to see where to look in the output.

R script: TS - Fisher one-way ANOVA.R

Data file: StudentStatistics.csv.

**with Excel**

You can either use the Data Analysis add-in from Excel to get static results (i.e. changes in the data will not be reflected in the results), or use the Excel functions.

*using built-in functions*

*using Data Analysis add-in*

**Manually (Formulas and example)**

**Formulas**

The formula for the one-way ANOVA test statistic is:

\(F=\frac{MS_{\textup{between}}}{MS_{\textup{within}}}\)

In this formula *MS* is short for Mean Square, which is the mean of the squared deviations. Their formulas are:

\(MS_{\textup{between}}=\frac{SS_{\textup{between}}}{df_{\textup{between}}}\)

\(MS_{\textup{within}}=\frac{SS_{\textup{within}}}{df_{\textup{within}}}\)

In these formulas, the *SS* is short for Sum of Squares, which in turn is short for Sum of Squared deviations from the mean, and *df* is short for degrees of freedom.

The formulas for the degrees of freedom are:

\(df_{\textup{between}}=k-1\)

\(df_{\textup{within}}=n-k\)

In these formulas *k* is the number of categories, and *n* the total sample size.

The formulas for the sums of squares are:

\(SS_{\textup{between}}=\sum_{i=1}^{k}n_i\times(\bar{x}_i-\bar{x})^2\)

\(SS_{\textup{within}}=\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{i,j}-\bar{x}_i)^2\)

In these formulas \(n_{i}\) is the number of scores in category *i*, \(\bar{x}\) the mean of all scores, \(\bar{x}_i\) the mean of the scores in the *i*-th category, and \(x_{i,j}\) the *j*-th score in the *i*-th category.

Formulas for the means are:

\(\bar{x}=\frac{\sum_{i=1}^nx_i}{n}\)

\(\bar{x}_i=\frac{\sum_{j=1}^{n_i}x_{i,j}}{n_i}\)

Where \(x_i\) is the *i*-th score.

**Example**

*Note*: different example than used in the rest of this section.

We are given grades people gave to a brand, and grouped the people in three categories (local, regional, outside). The grades were:

\(X_1=(2,4,8,6), X_2=(9,5,7), X_3=(1,7,7,6,10)\)

The first category has 4 scores, the second 3, and the third 5. Therefore:

\(k=3,n_1=4, n_2=3, n_3=5,n=4+3+5=12\)

Let's begin with the overall mean:

\(\bar{x}=\frac{\sum_{i=1}^nx_i}{n} =\frac{\sum_{i=1}^{12}x_i}{12} =\frac{2+4+8+6+9+5+7+1+7+7+6+10}{12}\)

\(=\frac{72}{12}=6\)

The means for each category are:

\(\bar{x}_1 =\frac{\sum_{j=1}^{n_1}x_{1,j}}{n_1} =\frac{\sum_{j=1}^{4}x_{1,j}}{4} =\frac{2+4+8+6}{4} =\frac{20}{4} =5\)

\(\bar{x}_2 =\frac{\sum_{j=1}^{n_2}x_{2,j}}{n_2} =\frac{\sum_{j=1}^{3}x_{2,j}}{3} =\frac{9+5+7}{3} =\frac{21}{3} =7\)

\(\bar{x}_3 =\frac{\sum_{j=1}^{n_3}x_{3,j}}{n_3} =\frac{\sum_{j=1}^{5}x_{3,j}}{5} =\frac{1+7+7+6+10}{5} =\frac{31}{5} =6.2\)

Now for the SS within:

\(SS_{\textup{within}}=\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{i,j}-\bar{x}_i)^2 =\sum_{i=1}^{3}\sum_{j=1}^{n_i}(x_{i,j}-\bar{x}_i)^2\)

\(=\sum_{j=1}^{n_1}(x_{1,j}-\bar{x}_1)^2+\sum_{j=1}^{n_2}(x_{2,j}-\bar{x}_2)^2+\sum_{j=1}^{n_3}(x_{3,j}-\bar{x}_3)^2\)

Let's do these sums one by one:

\(\sum_{j=1}^{n_1}(x_{1,j}-\bar{x}_1)^2 =\sum_{j=1}^{4}(x_{1,j}-5)^2\)

\(=(2-5)^2+(4-5)^2+(8-5)^2+(6-5)^2\)

\(=(-3)^2+(-1)^2+(3)^2+(1)^2 =9+1+9+1=20\)

\(\sum_{j=1}^{n_2}(x_{2,j}-\bar{x}_2)^2 =\sum_{j=1}^{3}(x_{2,j}-7)^2\)

\(=(9-7)^2+(5-7)^2+(7-7)^2 =(2)^2+(-2)^2+(0)^2\)

\(=4+4+0=8\)

\(\sum_{j=1}^{n_3}(x_{3,j}-\bar{x}_3)^2 =\sum_{j=1}^{5}\left(x_{3,j}-\frac{31}{5}\right)^2\)

\(=\left(1-\frac{31}{5}\right)^2+\left(7-\frac{31}{5}\right)^2+\left(7-\frac{31}{5}\right)^2+\left(6-\frac{31}{5}\right)^2+\left(10-\frac{31}{5}\right)^2\)

\(=\left(\frac{5}{5}-\frac{31}{5}\right)^2+\left(\frac{35}{5}-\frac{31}{5}\right)^2+\left(\frac{35}{5}-\frac{31}{5}\right)^2+\left(\frac{30}{5}-\frac{31}{5}\right)^2+\left(\frac{50}{5}-\frac{31}{5}\right)^2\)

\(=\left(\frac{5-31}{5}\right)^2+\left(\frac{35-31}{5}\right)^2+\left(\frac{35-31}{5}\right)^2+\left(\frac{30-31}{5}\right)^2+\left(\frac{50-31}{5}\right)^2\)

\(=\left(\frac{-26}{5}\right)^2+\left(\frac{4}{5}\right)^2+\left(\frac{4}{5}\right)^2+\left(\frac{1}{5}\right)^2+\left(\frac{19}{5}\right)^2\)

\(=\frac{\left(-26\right)^2}{5^2}+\frac{\left(4\right)^2}{5^2}+\frac{\left(4\right)^2}{5^2}+\frac{\left(1\right)^2}{5^2}+\frac{\left(19\right)^2}{5^2}\)

\(=\frac{676}{5^2}+\frac{16}{5^2}+\frac{16}{5^2}+\frac{1}{5^2}+\frac{361}{5^2}\)

\(=\frac{676+16+16+1+361}{5^2} =\frac{1070}{5^2} =\frac{214\times5}{5^2} =\frac{214}{5} =42.8\)

Using these three results we can determine the SS within:

\(SS_{\textup{within}}=20+8+\frac{214}{5}=\frac{100}{5}+\frac{40}{5}+\frac{214}{5} =\frac{100+40+214}{5} =\frac{354}{5} =70.8\)

Now for the SS between.

\(SS_{\textup{between}}=\sum_{i=1}^{k}n_i\times(\bar{x}_i-\bar{x})^2 =\sum_{i=1}^{3}n_i\times(\bar{x}_i-6)^2\)

\(=n_1\times(\bar{x}_1-6)^2+n_2\times(\bar{x}_2-6)^2+n_3\times(\bar{x}_3-6)^2\)

\(=4\times(5-6)^2+3\times(7-6)^2+5\times\left(\frac{31}{5}-6\right)^2\)

\(=4\times(-1)^2+3\times(1)^2+5\times\left(\frac{31}{5}-\frac{30}{5}\right)^2\)

\(=4\times1+3\times1+5\times\left(\frac{31-30}{5}\right)^2 =4+3+5\times\left(\frac{1}{5}\right)^2\)

\(=7+5\times\frac{1^2}{5^2} =7+\frac{5\times1^2}{5^2} =7+\frac{1^2}{5} =7+\frac{1}{5}\)

\(=\frac{35}{5}+\frac{1}{5} =\frac{35+1}{5} =\frac{36}{5} =7.2\)

Then the degrees of freedom:

\(df_{\textup{between}}=k-1=3-1=2\)

\(df_{\textup{within}}=n-k=12-3=9\)

We can now determine the Mean Square:

\(MS_{\textup{between}} =\frac{SS_{\textup{between}}}{df_{\textup{between}}} =\frac{\frac{36}{5}}{2} =\frac{36}{5\times2} =\frac{18\times2}{5\times2} =\frac{18}{5} =3.6\)

\(MS_{\textup{within}} =\frac{SS_{\textup{within}}}{df_{\textup{within}}} =\frac{\frac{354}{5}}{9} =\frac{354}{5\times9} =\frac{118\times3}{5\times3\times3} =\frac{118}{5\times3} =\frac{118}{15} \approx7.87\)

Finally the F-statistic:

\(F=\frac{MS_{\textup{between}}}{MS_{\textup{within}}} =\frac{\frac{18}{5}}{\frac{118}{15}} =\frac{18\times15}{118\times5} =\frac{2\times9\times3\times5}{2\times59\times5}\)

\(=\frac{9\times3}{59} =\frac{27}{59} \approx0.4576\)
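The hand calculation above can be verified in Python, for example with scipy's `f_oneway` (a quick sketch):

```python
# Check the manual example: grades grouped as local / regional / outside.
from scipy.stats import f_oneway

x1 = [2, 4, 8, 6]
x2 = [9, 5, 7]
x3 = [1, 7, 7, 6, 10]

f_val, p_val = f_oneway(x1, x2, x3)
print(round(f_val, 4))  # should match 27/59
```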

A significant test only shows there is an effect, but does not show which locations are significantly different from each other. To find this out, we should use a so-called post-hoc test, which is the topic of the next page.

Note that the regular/classical/Fisher one-way ANOVA is often not recommended. There are many alternatives, for example a Welch one-way ANOVA, or Brown-Forsythe test for means. Click below for more information on these alternatives.

**Alternatives for the classic one-way ANOVA**

**Which one to choose?**

The classic/Fisher one-way ANOVA assumes the data in each group is normally distributed and that the variances of the groups are the same in the population (homoscedasticity). Many have tried to cover the situations where one or both of these conditions are not met.

Delacre et al. (2019) recommend using the **Welch ANOVA** instead of the classic and Brown-Forsythe versions. How2stats (2018) give a slightly different recommendation, based on Tomarken and Serlin (1986). They agree that the Welch ANOVA is usually preferred over the classic version, but recommend the **Brown-Forsythe** if the average sample size is below six.

The researchers in the previous paragraph did not take other approaches into consideration. A few comments on those other methods follow.

According to Hartung et al. (2002, p. 225) the **Cochran test** is the standard test in meta-analysis, but it should not be used, since it is always too liberal.

Schneider and Penfield (1997) looked at the Welch, **Alexander-Govern** and the **James test** (they ignored the Brown-Forsythe since they found it to perform worse than Welch or James), and concluded: “Under variance heterogeneity, Alexander-Govern’s approximation was not only comparable to the Welch test and the James second-order test but was superior, in certain instances, when coupled with the power results for those tests” (p. 285).

Cavus and Yazici (2020) compared many different tests. They showed that the Brown-Forsythe, Box correction, Cochran, Hartung-Agac-Makabi adjusted Welch, and Scott-Smith test, all do not perform well, compared to the Asiribo-Gurland correction, Alexander-Govern test, Özdemir-Kurt B2, Mehrotra modified Brown-Forsythe, and Welch.

I only came across the **Johansen test** in Algina et al. (1991), and it appears to give the same results as the Welch test.

In my experience the one-way ANOVA is widely known and often discussed in textbooks. The Welch ANOVA is gaining popularity. The Brown-Forsythe is already more obscure, and some confuse it with the Brown-Forsythe test for variances. The James test and the Alexander-Govern test are perhaps the least known, and the Johansen test even less so (at least they were for me). So, although the Alexander-Govern test might be preferred over the Welch test, some researchers prefer a more commonly used test to a more obscure one. In the end it is up to you to decide which test might be best, and depending on the importance of your research you may want to investigate which test fits your situation best, rather than taking my word for it.

Besides these, there are more methods, some using simulation (bootstrapping) (see Cavus and Yazici (2020) for a few of them), others using different techniques (see Yiğit and Gökpinar (2010) for a few more methods not covered here).

**Cochran test**

*with Excel*

Excel file: TS - Test for Means.xlsm

*SPSS (not possible)*

Unfortunately, to my knowledge it is not possible to perform this test with SPSS Statistics (perhaps it could be done using SPSS syntax or the R plug-in).

*Formulas*

This test is also the basis for the Welch and James tests.

\( \chi_{Cochran}^2=\sum_{j=1}^k w_j\times\left(\bar{x}_j - \bar{y}_w\right)^2 \)

\( df = k - 1\)

\( \chi_{Cochran}^2\sim \chi^2\left(df\right) \)

With:

\( \bar{y}_w = \sum_{j=1}^k h_j\times \bar{x}_j\)

\( h_j = \frac{w_j}{w}\)

\( w_j = \frac{n_j}{s_j^2}\)

\( w = \sum_{j=1}^k w_j\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

The symbols used are \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(w_j\) the weight for category j, \(h_j\) the adjusted weight for category j, and \(df\) the degrees of freedom.

The original article should be Cochran (1937), but unfortunately I couldn't really find the formulas in there. The formulas shown are based on Cavus and Yazici (2020, p. 5), Hartung et al. (2002, p. 202), and Mezui-Mbeng (2015, p. 787), which all show the same formula as above.
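As a sketch, the formulas above can be translated into Python (illustrated here with the small grades example from the manual calculation earlier on this page):

```python
# Cochran test for means, following the formulas above.
import numpy as np
from scipy.stats import chi2

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
m_j = np.array([g.mean() for g in groups])          # category means
s2_j = np.array([g.var(ddof=1) for g in groups])    # sample variances
w_j = n_j / s2_j                                    # weights
h_j = w_j / w_j.sum()                               # adjusted weights
y_w = (h_j * m_j).sum()                             # weighted overall mean

chi2_cochran = (w_j * (m_j - y_w) ** 2).sum()
df = k - 1
p_val = chi2.sf(chi2_cochran, df)
print(round(chi2_cochran, 4), round(p_val, 4))
```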

**Welch ANOVA**

*with Excel*

Excel file: TS - Test for Means.xlsm

*with SPSS*

Data file: StudentStatistics.sav.

*Formulas*

This is an adjustment of the Cochran test, proposed by Welch (1947, 1951). It is also referred to as the Welch-Aspin test, since Welch also used work from Aspin (Aspin, 1948; Aspin & Welch, 1949). The formulas can be found on pages 330, 334, and 335 of the article from Welch (1951).

\( F_{Welch} = \frac{\frac{1}{k-1}\times\sum_{j=1}^k w_j\times\left(\bar{x}_j - \bar{y}_w\right)^2}{1 + \frac{2\times\left(k-2\right)}{k^2-1}\times \lambda}\)

\( = \frac{\frac{\chi_{Cochran}^2}{k-1}}{1 + \frac{2\times\left(k-2\right)}{k^2-1}\times \lambda}\)

\( \chi_{Cochran}^2=\sum_{j=1}^k w_j\times\left(\bar{x}_j - \bar{y}_w\right)^2 \)

\( df_1 = k - 1\)

\( df_2 = \frac{k^2-1}{3\times\lambda}\)

\( F_{Welch}\sim F\left(df_1, df_2\right)\)

With:

\( \bar{y}_w = \sum_{j=1}^k h_j\times \bar{x}_j\)

\( h_j = \frac{w_j}{w}\)

\( w_j = \frac{n_j}{s_j^2}\)

\( w = \sum_{j=1}^k w_j\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \lambda = \sum_{j=1}^k \frac{\left(1 - h_j\right)^2}{n_j - 1}\)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(w_j\) the weight for category j, \(h_j\) the adjusted weight for category j, \(df_i\) the i-th degrees of freedom, and \(\chi_{Cochran}^2\) the test statistic from the Cochran test.
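These formulas can be sketched in Python as well (again using the small grades example from the manual calculation):

```python
# Welch ANOVA following the formulas above.
import numpy as np
from scipy.stats import f

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])
w_j = n_j / s2_j
h_j = w_j / w_j.sum()
y_w = (h_j * m_j).sum()

lam = ((1 - h_j) ** 2 / (n_j - 1)).sum()
chi2_cochran = (w_j * (m_j - y_w) ** 2).sum()
f_welch = (chi2_cochran / (k - 1)) / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
df1, df2 = k - 1, (k ** 2 - 1) / (3 * lam)
p_val = f.sf(f_welch, df1, df2)
print(round(f_welch, 4), round(df2, 2), round(p_val, 4))
```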

**James Test**

*with Python*

video to be uploaded

Jupyter Notebook: TS - James Test.ipynb

Data file: StudentStatistics.csv.

*SPSS (not possible)*

Unfortunately, to my knowledge it is not possible to perform this test with SPSS Statistics (perhaps it could be done using SPSS syntax or the R plug-in).

*Formulas*

James (1951) proposed three tests: one for large group sizes, a 'first order test', and a 'second order test'. For the latter two, a significance level (α) is chosen and a critical value is then calculated based on a modification of the chi-square distribution.

The James test statistic J is the same as the test statistic in Cochran's test; it is calculated slightly differently, but leads to the same result.

\( J = \sum_{j=1}^k w_j\times\bar{x}_j^2 - \bar{y}_w^* = \chi_{Cochran}^2\)

With:

\( w_j = \frac{n_j}{s_j^2}\)

\( w = \sum_{j=1}^k w_j\)

\( h_j = \frac{w_j}{w}\)

\( \bar{y}_w^* = \frac{\left(\sum_{j=1}^k w_j\times \bar{x}_j\right)^2}{w}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( df = k - 1 \)

For large group sizes:

\( J\sim\chi^2\left(df\right) \)

First-order James test

Reject null hypothesis if \(J > J_{crit1} \)

\( J_{crit1} = \chi_{crit}^2\times\left(1 + \frac{3\times \chi_{crit}^2 + k + 1}{2\times\left(k^2 - 1\right)}\times \lambda\right) \)

\( \lambda = \sum_{j=1}^k \frac{\left(1 - h_j\right)^2}{v_j} \)

\( v_j = n_j - 1 \)

\( \chi_{crit}^2 = Q\left(\chi^2\left(1 - \alpha, df\right)\right) \)

Second-order James test

Reject null hypothesis if \(J > J_{crit2} \)

\( J_{crit2} = \sum_{r=1}^9 a_r \)

\( a_1 = \chi_{crit}^2 + \frac{1}{2}\times\left(3\times\chi_4 + \chi_2\right)\times \lambda_2 \)

\( a_2 = \frac{1}{16}\times\left(3\times\chi_4 + \chi_2\right)^2\times\left(1 - \frac{k - 3}{\chi_{crit}^2}\right)\times \lambda_2^2\)

\( a_{3f} = \frac{1}{2}\times\left(3\times\chi_4+\chi_2\right)\)

\( a_{3a} = 8\times R_{23} - 10\times R_{22} + 4\times R_{21} - 6\times R_{12}^2 + 8\times R_{12}\times R_{11} - 4\times R_{11}^2 \)

\( a_{3b} = \left(2\times R_{23} - 4\times R_{22} + 2\times R_{21} - 2\times R_{12}^2 + 4\times R_{12}\times R_{11} - 2\times R_{11}^2\right)\times\left(\chi_2-1\right) \)

\( a_{3c} = \frac{1}{4}\times\left(-R_{12}^2 + 4\times R_{12}\times R_{11} - 2\times R_{12}\times R_{10} - 4\times R_{11}^2 + 4\times R_{11}\times R_{10} - R_{10}^2\right)\times\left(3\times\chi_4 - 2\times\chi_2 - 1\right) \)

\( a_3 = a_{3f}\times\left(a_{3a} + a_{3b} + a_{3c} \right) \)

\( a_4 = \left(R_{23} - 3\times R_{22} + 3\times R_{21} - R_{20}\right)\times\left(5\times \chi_6 + 2\times\chi_4 + \chi_2\right) \)

\( a_5 = \frac{3}{16}\times\left(R_{12}^2 - 4\times R_{23} + 6\times R_{22} - 4\times R_{21} + R_{20}\right)\times\left(35\times\chi_8 + 15\times\chi_6 + 9\times\chi_4 + 5\times\chi_2\right) \)

\( a_6 = \frac{1}{16}\times\left(-2\times R_{22}^2 + 4\times R_{21} - R_{20} + 2\times R_{12}\times R_{10} - 4\times R_{11}\times R_{10} + R_{10}^2\right)\times\left(9\times\chi_8 - 3\times\chi_6 - 5\times\chi_4 - \chi_2\right) \)

\( a_7 = \frac{1}{16}\times\left(-2\times R_{22}^2 + 4\times R_{21} - R_{20} + 2\times R_{12}\times R_{10} - 4\times R_{11}\times R_{10} + R_{10}^2\right)\times\left(9\times\chi_8 - 3\times\chi_6 - 5\times\chi_4 - \chi_2\right) \)

\( a_8 = \frac{1}{4}\times\left(-R_{22} + R_{11}^2\right)\times\left(27\times\chi_8 + 3\times\chi_6 + \chi_4 + \chi_2\right) \)

\( a_9 = \frac{1}{4}\times\left(R_{23} - R_{12}\times R_{11}\right)\times\left(45\times\chi_8 + 9\times\chi_6 + 7\times\chi_4 + 3\times\chi_2\right) \)

\( \lambda_2 = \sum_{j=1}^k \frac{\left(1 - h_j\right)^2}{v_j^*} \)

\( v_j^* = n_j - 2 \)

\( R_{xy} = \sum_{j=1}^k \frac{1}{\left(v_j^*\right)^x}\times\left(\frac{w_j}{w}\right)^y \)

\( \chi_{2\times r} = \frac{\left(\chi_{crit}^2\right)^r}{\prod_{i=1}^r \left(k + 2\times i - 3\right)} \)

\( \chi_{crit}^2 = \chi_{crit}^2\left(1 - \alpha, df\right) \)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(w_j\) the weight for category j, \(df\) the degrees of freedom, and \(Q\left(x\right)\) the quantile function (= inverse cumulative distribution = percentile function = percent-point function).

Note that the formula \(v_j^* = n_j-2\) for the second-order test is based on: "..., we finally obtain, as the approximation of order -2 in the \(v_i\),..." (James, 1951, p. 328). It can also be found in Deshon and Alexander (1994, p. 331). However, other authors use \(v_j = n_j - 1\) in the calculation, for example Myers (1998, p. 209) and Cribbie et al. (2002, p. 62).
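The first-order test can be sketched in Python as follows (using the small grades example from the manual calculation, and α = .05):

```python
# First-order James test following the formulas above.
import numpy as np
from scipy.stats import chi2

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]
alpha = 0.05

k = len(groups)
n_j = np.array([len(g) for g in groups])
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])
w_j = n_j / s2_j
w = w_j.sum()
h_j = w_j / w

# J equals the Cochran chi-square statistic
J = (w_j * m_j ** 2).sum() - (w_j * m_j).sum() ** 2 / w

df = k - 1
chi2_crit = chi2.ppf(1 - alpha, df)
lam = ((1 - h_j) ** 2 / (n_j - 1)).sum()
J_crit1 = chi2_crit * (1 + (3 * chi2_crit + k + 1) / (2 * (k ** 2 - 1)) * lam)
print(round(J, 4), round(J_crit1, 4), J > J_crit1)
```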

**Box Correction**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Box Adjustment.ipynb

Data file: StudentStatistics.csv.

*SPSS (not possible)*

Unfortunately, it is to my knowledge not possible to perform this test with SPSS Statistics (, perhaps using SPSS syntax or the R-plugin it could be done).

*Formulas*

A proposal to correct the original F-statistic by a specific factor from Box (1954) with also adjusted degrees of freedom:

\( F_{Box} = \frac{F}{c} \)

\( df_1^* = \frac{\left(\sum_{j=1}^k\left(n-n_j\right)\times s_j^2\right)^2}{\left(\sum_{j=1}^k n_j\times s_j^2\right)^2 + n\times\sum_{j=1}^k\left(n - 2\times n_j\right)\times s_j^4} \)

\( df_2^* = \frac{\left(\sum_{j=1}^k \left(n_j-1\right)\times s_j^2\right)^2}{\sum_{j=1}^k\left(n_j-1\right)\times s_j^4}\)

\( F_{Box} \sim F\left(df_1^*, df_2^*\right) \)

With:

\( c = \frac{n-k}{n\times\left(k-1\right)}\times\frac{\sum_{j=1}^k\left(n-n_j\right)\times s_j^2}{\sum_{j=1}^k\left(n_j-1\right)\times s_j^2} \)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(df_i^*\) the i-th adjusted degrees of freedom, and \(F\) the F-statistic from the regular one-way ANOVA.

The \(F_{Box}\) value is the same as that of the Brown-Forsythe test for means. The R functions in the doex and onewaytests libraries actually use this. They also use a different formula for the second degrees of freedom, which leads to a different result:

\( df_2^* = \frac{\left(\sum_{j=1}^k\left(1 - \frac{n_j}{n}\right)\times s_j^2\right)^2}{\frac{\sum_{j=1}^k\left(1 - \frac{n_j}{n}\right)^2\times s_j^4}{n-k}} \)
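Box's correction, with his degrees of freedom, can be sketched in Python as (using the small grades example from the manual calculation):

```python
# Box-corrected one-way ANOVA following the formulas above.
import numpy as np
from scipy.stats import f, f_oneway

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
n = n_j.sum()
s2_j = np.array([g.var(ddof=1) for g in groups])

f_classic = f_oneway(*groups).statistic
c = (n - k) / (n * (k - 1)) * ((n - n_j) * s2_j).sum() / ((n_j - 1) * s2_j).sum()
f_box = f_classic / c

df1 = ((n - n_j) * s2_j).sum() ** 2 / (
    (n_j * s2_j).sum() ** 2 + n * ((n - 2 * n_j) * s2_j ** 2).sum())
df2 = ((n_j - 1) * s2_j).sum() ** 2 / ((n_j - 1) * s2_j ** 2).sum()
p_val = f.sf(f_box, df1, df2)
print(round(f_box, 4), round(df1, 3), round(df2, 3))
```

Note that \(F_{Box}\) indeed comes out equal to the Brown-Forsythe F for these data.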

**Scott-Smith ANOVA**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Scott-Smith.ipynb

Data file: StudentStatistics.csv.

*SPSS (not possible)*

*Formulas*

Scott and Smith's (1971) solution was to use the following:

\( \chi_{SS}^2 = \sum_{j=1}^k z_j^2 \)

\( df = k \)

\( \chi_{SS}^2 \sim \chi^2\left(df\right) \)

With:

\( z_j = t_j\times\sqrt{\frac{n_j-3}{n_j-1}} \)

\( t_j = \frac{\bar{x}_j - \bar{x}}{\sqrt{\frac{s_j^2}{n_j}}} \)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( \bar{x} = \frac{\sum_{j=1}^{k}n_j\times \bar{x}_j}{n} = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j} x_{i,j}}{n}\)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(\bar{x}\) the sample mean of all scores, \(n\) the total sample size, and \(df\) the degrees of freedom.
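A Python sketch of these formulas (using the small grades example from the manual calculation):

```python
# Scott-Smith test following the formulas above.
import numpy as np
from scipy.stats import chi2

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])
grand_mean = np.concatenate(groups).mean()

t_j = (m_j - grand_mean) / np.sqrt(s2_j / n_j)
z_j = t_j * np.sqrt((n_j - 3) / (n_j - 1))
chi2_ss = (z_j ** 2).sum()
df = k
p_val = chi2.sf(chi2_ss, df)
print(round(chi2_ss, 4), round(p_val, 4))
```

Note the quirk that a category of size 3 contributes nothing to the statistic, since \(n_j - 3 = 0\) makes its \(z_j\) zero.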

**Brown-Forsythe Test for Means**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Brown-Forsythe Means.ipynb

Data file: StudentStatistics.csv.

*with R (studio)*

video to be uploaded

R script: TS - Brown-Forsythe Means.R

Data file: StudentStatistics.csv.

*Formulas*

The Brown-Forsythe Test for means (Brown & Forsythe, 1974) appears to give the same results as the Box correction, and only differ in the value of the second degrees of freedom.

\( F_{BF} = \frac{\sum_{j=1}^k n_j\times\left(\bar{x}_j - \bar{x}\right)^2}{\sum_{j=1}^k\left(1-\frac{n_j}{n}\right)\times s_j^2} \)

\( df_1 = k - 1 \)

\( df_2 = \frac{\left(\sum_{j=1}^k\left(1-\frac{n_j}{n}\right)\times s_j^2\right)^2}{\sum_{j=1}^k \frac{\left(1-\frac{n_j}{n}\right)^2\times s_j^4}{n_j - 1}} \)

\( F_{BF}\sim F\left(df_1, df_2\right) \)

With:

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( \bar{x} = \frac{\sum_{j=1}^{k}n_j\times \bar{x}_j}{n} = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j} x_{i,j}}{n}\)

\( n = \sum_{j=1}^k n_j \)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(n\) the total sample size, and \(df_i\) the i-th degrees of freedom.
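These formulas as a Python sketch (using the small grades example from the manual calculation):

```python
# Brown-Forsythe test for means following the formulas above.
import numpy as np
from scipy.stats import f

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
n = n_j.sum()
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])
grand_mean = (n_j * m_j).sum() / n

f_bf = (n_j * (m_j - grand_mean) ** 2).sum() / ((1 - n_j / n) * s2_j).sum()
df1 = k - 1
df2 = ((1 - n_j / n) * s2_j).sum() ** 2 / (
    ((1 - n_j / n) ** 2 * s2_j ** 2) / (n_j - 1)).sum()
p_val = f.sf(f_bf, df1, df2)
print(round(f_bf, 4), round(df2, 3), round(p_val, 4))
```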

**Alexander-Govern Test for Means**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Alexander-Govern.ipynb

Data file: StudentStatistics.csv.

*with R (studio)*

video to be uploaded

R script: TS - Alexander-Govern Test.R

Data file: StudentStatistics.csv.

*SPSS (not possible)*

*Formulas*

Alexander and Govern (1994) proposed the following:

\( A = \sum_{j=1}^k z_j^2 \)

\( df = k - 1 \)

\( A \sim \chi^2\left(df\right) \)

With:

\( z_j = c_j + \frac{c_j^3 + 3\times c_j}{b_j} - \frac{4\times c_j^7 + 33\times c_j^5 + 240\times c_j^3 + 855\times c_j}{10\times b_j^2 + 8\times b_j\times c_j^4 + 1000\times b_j} \)

\( c_j = \sqrt{a_j\times\ln\left(1 + \frac{t_j^2}{n_j - 1}\right)} \)

\( b_j = 48\times a_j^2 \)

\( a_j = n_j - 1.5 \)

\( t_j = \frac{\bar{x}_j - \bar{y}_w}{\sqrt{\frac{s_j^2}{n_j}}} \)

\( \bar{y}_w = \sum_{j=1}^k h_j\times \bar{x}_j\)

\( h_j = \frac{w_j}{w}\)

\( w_j = \frac{n_j}{s_j^2}\)

\( w = \sum_{j=1}^k w_j\)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(n\) the total sample size, and \(df\) the degrees of freedom.
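In Python this test is available directly in scipy (since SciPy 1.7) as `scipy.stats.alexandergovern`; a quick sketch with the small grades example from the manual calculation:

```python
# Alexander-Govern test via scipy.
from scipy.stats import alexandergovern

x1 = [2, 4, 8, 6]
x2 = [9, 5, 7]
x3 = [1, 7, 7, 6, 10]

res = alexandergovern(x1, x2, x3)
print(round(res.statistic, 4), round(res.pvalue, 4))
```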

**Mehrotra modified Brown-Forsythe Test for Means**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Mehrotra.ipynb

Data file: StudentStatistics.csv.

*with R (studio)*

video to be uploaded

R script: TS - Mehrotra-Brown-Forsythe Means.R

Data file: StudentStatistics.csv.

*SPSS (not possible)*

*Formulas*

Mehrotra (1997) modified the calculation for the first degrees of freedom in the Brown-Forsythe test for means, all other values are the same.

\( F_{MBF} = \frac{\sum_{j=1}^k n_j\times\left(\bar{x}_j - \bar{x}\right)^2}{\sum_{j=1}^k\left(1-\frac{n_j}{n}\right)\times s_j^2} \)

\( df_1^* = \frac{\left(\sum_{j=1}^k s_j^2 - \frac{\sum_{j=1}^k n_j\times s_j^2}{n}\right)^2}{\sum_{j=1}^k s_j^4 + \left(\frac{\sum_{j=1}^k n_j\times s_j^2}{n}\right)^2 - 2\times\frac{\sum_{j=1}^k n_j \times s_j^4}{n}} \)

\( df_2 = \frac{\left(\sum_{j=1}^k\left(1-\frac{n_j}{n}\right)\times s_j^2\right)^2}{\sum_{j=1}^k \frac{\left(1-\frac{n_j}{n}\right)^2\times s_j^4}{n_j - 1}} \)

\( F_{MBF}\sim F\left(df_1^*, df_2\right) \)

With:

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( \bar{x} = \frac{\sum_{j=1}^{k}n_j\times \bar{x}_j}{n} = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j} x_{i,j}}{n}\)

\( n = \sum_{j=1}^k n_j \)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(n\) the total sample size, and \(df_i\) the i-th degrees of freedom.
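A Python sketch of the modification (using the small grades example from the manual calculation); only the first degrees of freedom differs from the Brown-Forsythe version:

```python
# Mehrotra modified Brown-Forsythe test following the formulas above.
import numpy as np
from scipy.stats import f

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
n = n_j.sum()
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])
grand_mean = (n_j * m_j).sum() / n

f_mbf = (n_j * (m_j - grand_mean) ** 2).sum() / ((1 - n_j / n) * s2_j).sum()
df1 = (s2_j.sum() - (n_j * s2_j).sum() / n) ** 2 / (
    (s2_j ** 2).sum() + ((n_j * s2_j).sum() / n) ** 2
    - 2 * (n_j * s2_j ** 2).sum() / n)
df2 = ((1 - n_j / n) * s2_j).sum() ** 2 / (
    ((1 - n_j / n) ** 2 * s2_j ** 2) / (n_j - 1)).sum()
p_val = f.sf(f_mbf, df1, df2)
print(round(f_mbf, 4), round(df1, 3), round(df2, 3))
```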

**Hartung-Agac-Makabi Adjusted Welch ANOVA**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Hartung-Agac-Makabi.ipynb

Data file: StudentStatistics.csv.

*SPSS (not possible)*

*Formulas*

Hartung, Agac, and Makabi (2002) added another modification to the Welch test.

\( F_{HAM} = \frac{\frac{1}{k-1}\times\sum_{j=1}^k w_j^*\times\left(\bar{x}_j - \bar{y}_w^*\right)^2}{1 + \frac{2\times\left(k-2\right)}{k^2-1}\times \lambda^*}\)

\( df_1 = k - 1\)

\( df_2 = \frac{k^2-1}{3\times\lambda^*}\)

\( F_{HAM}\sim F\left(df_1, df_2\right)\)

With:

\( \bar{y}_w^* = \sum_{j=1}^k h_j^*\times \bar{x}_j\)

\( h_j^* = \frac{w_j^*}{w^*}\)

\( w_j^* = \frac{n_j}{s_j^2}\times\frac{1}{\phi_j}\)

\( w^* = \sum_{j=1}^k w_j^*\)

\( \phi_j = \frac{n_j + 2}{n_j + 1}\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( \lambda^* = \sum_{j=1}^k \frac{\left(1 - h_j^*\right)^2}{n_j - 1}\)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(w_j^*\) the modified weight for category j, \(h_j^*\) the adjusted modified weight for category j, \(df_i\) the i-th degrees of freedom, and \(\phi_j\) the modification factor.

Note that the R library 'doex' uses \( \phi_j = \frac{n_j - 1}{n_j - 3}\). The original article, though, states that these are the unbalanced weights of the Welch test and that, in their experience, using them makes the test too conservative. In the original article the authors find from their simulation experience that using \( \phi_j = \frac{n_j + 2}{n_j + 1}\) gives reliable results for small sample sizes and a large number of populations.
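A Python sketch of the adjusted Welch test, using the article's \( \phi_j = \frac{n_j + 2}{n_j + 1}\) and the small grades example from the manual calculation:

```python
# Hartung-Agac-Makabi adjusted Welch ANOVA following the formulas above.
import numpy as np
from scipy.stats import f

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]

k = len(groups)
n_j = np.array([len(g) for g in groups])
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])

phi_j = (n_j + 2) / (n_j + 1)
w_j = n_j / s2_j / phi_j          # modified weights
h_j = w_j / w_j.sum()
y_w = (h_j * m_j).sum()
lam = ((1 - h_j) ** 2 / (n_j - 1)).sum()

f_ham = ((w_j * (m_j - y_w) ** 2).sum() / (k - 1)) / (
    1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
df1, df2 = k - 1, (k ** 2 - 1) / (3 * lam)
p_val = f.sf(f_ham, df1, df2)
print(round(f_ham, 4), round(df2, 3))
```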

**Özdemir-Kurt B2 ANOVA**

*with Python*

video to be uploaded

Jupyter Notebook: TS - Ozdemir-Kurt B2.ipynb

Data file: StudentStatistics.csv.

*with R (studio)*

video to be uploaded

R script: TS - Ozdemir-Kurt B2.R

Data file: StudentStatistics.csv.

*SPSS (not possible)*

*Formulas*

Özdemir and Kurt (2006) made the following test:

\( B^2 = \sum_{j=1}^k\left(c_j\times\sqrt{\ln\left(1+\frac{t_j^2}{v_j}\right)}\right)^2 \)

Reject null hypothesis if \( B^2 > \chi_{crit}^2\)

\( df = k - 1 \)

\( B^2 \sim \chi^2\left(df\right) \) (approximately)

With:

\( \bar{y}_w = \sum_{j=1}^k h_j\times \bar{x}_j\)

\( h_j = \frac{w_j}{w}\)

\( w_j = \frac{n_j}{s_j^2}\)

\( w = \sum_{j=1}^k w_j\)

\( \bar{x}_j = \frac{\sum_{i=1}^{n_j} x_{i,j}}{n_j}\)

\( s_j^2 = \frac{\sum_{i=1}^{n_j} \left(x_{i,j} - \bar{x}_j\right)^2}{n_j - 1}\)

\( t_j = \frac{\bar{x}_j - \bar{y}_w}{\sqrt{\frac{s_j^2}{n_j}}} \)

\( v_j = n_j - 1\)

\( c_j = \frac{4\times v_j^2 + \frac{5\times\left(2\times z_{crit}^2+3\right)}{24}}{4\times v_j^2+v_j+\frac{4\times z_{crit}^2+9}{12}}\times\sqrt{v_j} \)

\( z_{crit} = Q\left(Z\left(1 - \frac{\alpha}{2}\right)\right) \)

\( \chi_{crit}^2 = \chi_{crit}^2\left(1 - \alpha, df\right) \)

The symbols used are \(k\) for the number of categories, \(x_{i,j}\) for the i-th score in category j, \(n_j\) the sample size of category j, \(\bar{x}_j\) the sample mean of category j, \(s_j^2\) the sample variance of the scores in category j, \(w_j\) the weight for category j, \(h_j\) the adjusted weight for category j, \(df\) the degrees of freedom, and \(Q\left(x\right)\) the quantile function (= inverse cumulative distribution = percentile function = percent-point function).
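A Python sketch of the B2 test (using the small grades example from the manual calculation, and α = .05); note that squaring \(c_j\times\sqrt{\ln(\dots)}\) simply gives \(c_j^2\times\ln(\dots)\):

```python
# Ozdemir-Kurt B2 test following the formulas above.
import numpy as np
from scipy.stats import chi2, norm

groups = [np.array([2, 4, 8, 6]), np.array([9, 5, 7]), np.array([1, 7, 7, 6, 10])]
alpha = 0.05

k = len(groups)
n_j = np.array([len(g) for g in groups])
m_j = np.array([g.mean() for g in groups])
s2_j = np.array([g.var(ddof=1) for g in groups])
w_j = n_j / s2_j
h_j = w_j / w_j.sum()
y_w = (h_j * m_j).sum()

t_j = (m_j - y_w) / np.sqrt(s2_j / n_j)
v_j = n_j - 1
z_crit = norm.ppf(1 - alpha / 2)
c_j = (4 * v_j ** 2 + 5 * (2 * z_crit ** 2 + 3) / 24) / (
    4 * v_j ** 2 + v_j + (4 * z_crit ** 2 + 9) / 12) * np.sqrt(v_j)

b2 = (c_j ** 2 * np.log(1 + t_j ** 2 / v_j)).sum()
df = k - 1
chi2_crit = chi2.ppf(1 - alpha, df)
print(round(b2, 4), round(chi2_crit, 4), b2 > chi2_crit)
```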

FAQ:

Q: What does ANOVA stand for?

A: The term ANOVA is short for ANalysis Of VAriance. It might seem a bit strange to look at variances instead of means, but by comparing different variances something can be said about the means.

Q: Why is it called **one-way** ANOVA, is there also a **two-way** ANOVA?

A: Yes, there is also a two-way ANOVA. 'One-way' means that we are looking at the influence of one nominal variable on the scale variable, while with a two-way ANOVA you would look at the influence of two nominal variables on one scale variable.
