Nominal vs Nominal

Part 3: Testing

If you are only interested in whether the overall distribution changed (i.e. whether the percentage in each category changed or not), you can perform a marginal homogeneity test. Two tests seem to be quite popular for this: the Stuart-Maxwell test (Stuart, 1955; Maxwell, 1970) and the Bhapkar test (Bhapkar, 1966). According to Uebersax (2006), who also gives a nice example, the Bhapkar test is preferred.
We can also test whether the changes are 'symmetric', in other words whether the number of people who switched, for example, from A to B is the same as the number of people who switched from B to A. This can be done with a McNemar-Bowker test (McNemar, 1947; Bowker, 1948).

The Bhapkar test

In the example the Bhapkar test shows a chi-squared value of 15.423 with a significance of 0.0004. This indicates that there is a .0004 (0.04%) chance of obtaining a chi-squared value of 15.423 or even bigger, if the overall percentages for each category had not changed in the population. This chance is so low that the percentages have probably also changed in the population. We can therefore speak of a significant result and report it as:

A Bhapkar test revealed that the commercials have a significant effect on the overall results, χ²(2, N = 128) = 15.42, p < .001.
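
The reported significance can be checked from the chi-squared value and the degrees of freedom alone. With 2 degrees of freedom the upper tail of the chi-squared distribution has the closed form exp(-x/2), so a minimal Python sketch needs only the standard library (the variable names are my own choice):

```python
import math

chi2_value = 15.423  # chi-squared value reported for the Bhapkar test
df = 2               # degrees of freedom: number of categories minus one

# With df = 2 the upper-tail probability P(X >= x) equals exp(-x / 2).
p_value = math.exp(-chi2_value / 2)

print(f"chi2({df}) = {chi2_value}, p = {p_value:.4f}")  # p = 0.0004
```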

Click here to see how you can perform a Bhapkar test, with R (Studio), Excel, Python, or special software.

with SPSS (not possible)

Unfortunately it is not possible in the GUI of SPSS to perform a Bhapkar test. You can create a cross table with SPSS and use the table as the input for any of the other programs mentioned below.

with R (Studio)


with Excel

with Python
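
A minimal sketch of the Bhapkar test using only the Python standard library. The 3x3 counts below are an assumption: the table is not given in this section, and I picked counts that give N = 128 and the chi-squared value reported above. Rows are assumed to be 'before' and columns 'after'. The function names `chi2_sf`, `solve` and `bhapkar_test` are my own.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 / math.pi) * math.exp(-half) * total

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def bhapkar_test(table):
    """Return (chi2, df, p) of the Bhapkar test for a square table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_n = [sum(row) for row in table]
    col_n = [sum(row[j] for row in table) for j in range(k)]
    # Differences in marginal proportions; the last category is redundant.
    d = [(row_n[i] - col_n[i]) / n for i in range(k - 1)]
    # Sample covariance matrix of those differences (times n).
    s = [[0.0] * (k - 1) for _ in range(k - 1)]
    for i in range(k - 1):
        for j in range(k - 1):
            if i == j:
                s[i][j] = (row_n[i] + col_n[i] - 2 * table[i][i]) / n - d[i] ** 2
            else:
                s[i][j] = -(table[i][j] + table[j][i]) / n - d[i] * d[j]
    chi2 = n * sum(di * xi for di, xi in zip(d, solve(s, d)))
    return chi2, k - 1, chi2_sf(chi2, k - 1)

# Assumed counts (rows: brand A, B, C before; columns: brand A, B, C after).
table = [[20, 10, 5],
         [3, 30, 15],
         [0, 5, 40]]

chi2, df, p = bhapkar_test(table)
print(f"chi2({df}) = {chi2:.3f}, p = {p:.4f}")
```

If the statsmodels package is installed, its `SquareTable` class offers `homogeneity(method='bhapkar')`, which should give the same result without the hand-written linear algebra.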

specialised (free) software

John Uebersax created a small free DOS program that can perform the test. You can download the program from here, and watch the instructions below. The output of the program also includes the post-hoc test.

Unfortunately the Bhapkar test is an omnibus test, so it does not tell us which categories changed significantly in proportions. For that we can use a McNemar test, but since we are performing multiple tests now we have to adjust the significance level. One method to adjust for multiple testing is the Bonferroni method.

In the example we can see that for brand A and brand C the significance after the Bonferroni adjustment is still below 0.05, which indicates that for those two brands the percentage has indeed changed significantly.

A Bhapkar test revealed that the commercials have a significant effect on the overall results, χ²(2, N = 128) = 15.42, p < .001. A McNemar post-hoc test with Bonferroni adjustment revealed that the proportions before vs after changed significantly for brand A (p = .005) and brand C (p = .003).

Click here to see how you can perform the post-hoc test, with Excel, Python, or special software.

with Excel

with Python
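
A minimal sketch of this post-hoc test in plain Python. For each category the table is collapsed into 'this category' versus 'any other category', a McNemar test is run on the two discordant counts, and the p-value is multiplied by the number of tests (Bonferroni). Two things are assumptions here: the counts (not given in this section; chosen to give N = 128) and the McNemar variant (chi-squared without continuity correction). The function names are my own.

```python
import math

def mcnemar_chi2(b, c):
    """Chi-squared McNemar test without continuity correction; returns (chi2, p)."""
    if b + c == 0:
        return 0.0, 1.0
    chi2 = (b - c) ** 2 / (b + c)
    # With df = 1 the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def posthoc_marginal(table, labels):
    """McNemar test per category (category vs. rest) with Bonferroni adjustment."""
    k = len(table)
    results = {}
    for i, label in enumerate(labels):
        b = sum(table[i]) - table[i][i]                 # in category before, not after
        c = sum(row[i] for row in table) - table[i][i]  # not in category before, in it after
        chi2, p = mcnemar_chi2(b, c)
        results[label] = (chi2, p, min(1.0, p * k))     # adjusted p is capped at 1
    return results

# Assumed counts (rows: brand A, B, C before; columns: brand A, B, C after).
table = [[20, 10, 5],
         [3, 30, 15],
         [0, 5, 40]]

results = posthoc_marginal(table, ["A", "B", "C"])
for label, (chi2, p, p_adj) in results.items():
    print(f"brand {label}: chi2 = {chi2:.3f}, p = {p:.4f}, Bonferroni p = {p_adj:.4f}")
```

With these assumed counts only brand A and brand C stay below 0.05 after the adjustment.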

specialised (free) software

John Uebersax created a small free DOS program that can perform the test. You can download the program from here, and watch the instructions below. The output of the program also includes the Bhapkar test itself.

The McNemar-Bowker test

You can also perform the McNemar-Bowker test, sometimes simply called the Bowker test.

For the example, the results show that the Bowker test had a chi-squared value of 13.769 with a significance of 0.0032. This indicates that there is a .0032 (0.32%) chance of obtaining a chi-squared value of 13.769 or even bigger, if the changes are symmetrical in the population. This chance is so low that the changes are probably also not symmetrical in the population. We can therefore speak of a significant result and report it as:

A Bowker test revealed that the changes in opinion on favourite brand before and after the commercials were shown were not symmetrical, χ²(3, N = 128) = 13.77, p = .003.
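
As a quick check on the reported significance: with 3 degrees of freedom the upper tail of the chi-squared distribution can be written with the complementary error function, so the p-value follows from the chi-squared value using only Python's standard library. This is a sketch; the variable names are my own.

```python
import math

chi2_value = 13.769  # chi-squared value reported for the Bowker test
df = 3               # degrees of freedom: number of category pairs, 3 * 2 / 2

# For df = 3: P(X >= x) = erfc(sqrt(x / 2)) + sqrt(2 x / pi) * exp(-x / 2)
p_value = (math.erfc(math.sqrt(chi2_value / 2))
           + math.sqrt(2 * chi2_value / math.pi) * math.exp(-chi2_value / 2))

print(f"chi2({df}) = {chi2_value}, p = {p_value:.4f}")  # p = 0.0032
```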

Click here to see how to perform a McNemar-Bowker test with SPSS, R (Studio), Excel, or Python.

with SPSS


with R (Studio)


with Excel

with Python
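
A minimal sketch of the Bowker test in plain Python. The statistic sums (n_ij - n_ji)² / (n_ij + n_ji) over all pairs of categories, and the degrees of freedom equal the number of pairs. The 3x3 counts are an assumption: they are not given in this section, and I chose counts that give N = 128 and the chi-squared value reported above. Skipping pairs in which nobody switched (and lowering the degrees of freedom accordingly) is also my own choice. The function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 / math.pi) * math.exp(-half) * total

def bowker_test(table):
    """Return (chi2, df, p) of the Bowker symmetry test for a square table of counts."""
    k = len(table)
    chi2, df = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            switched = table[i][j] + table[j][i]
            if switched > 0:  # pairs without any switchers carry no information
                chi2 += (table[i][j] - table[j][i]) ** 2 / switched
                df += 1
    return chi2, df, chi2_sf(chi2, df)

# Assumed counts (rows: brand A, B, C before; columns: brand A, B, C after).
table = [[20, 10, 5],
         [3, 30, 15],
         [0, 5, 40]]

chi2, df, p = bowker_test(table)
print(f"chi2({df}) = {chi2:.3f}, p = {p:.4f}")
```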

The Bowker test is an omnibus test, so it does not reveal which changes are significant. To find out, we can perform a McNemar test on each 2 by 2 sub table (a pairwise comparison) and adjust for the multiple testing.

In this example only the McNemar test for B versus C is significant, i.e. the changes from B to C are not symmetric with the changes from C to B (significance is 0.041). However, we have to adjust for the multiple testing. In this example we have done three tests, so we have to multiply the significance by three (the Bonferroni adjustment), and unfortunately this one then also becomes bigger than 0.05 (it will be 3 x 0.041 = 0.123). This means that although overall the changes are not symmetrical, this cannot be pinpointed to a specific change.

Click here to see how to perform this post-hoc test with SPSS, R (Studio), Excel, or Python.

with SPSS

with R (Studio)

with Excel

with Python
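
A minimal sketch of the pairwise post-hoc test in plain Python. For every pair of categories it takes the two off-diagonal counts, runs an exact (binomial) McNemar test on them, and multiplies the p-value by the number of pairs (Bonferroni). The counts are an assumption (not given in this section; chosen to give N = 128), as is the use of the exact variant of the McNemar test. The function names are my own.

```python
import math
from itertools import combinations

def mcnemar_exact(b, c):
    """Two-sided exact (binomial) McNemar p-value for discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0
    # Probability of min(b, c) or fewer 'successes' in n fair coin flips, doubled.
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def pairwise_mcnemar(table, labels):
    """Exact McNemar test for each pair of categories, with Bonferroni adjustment."""
    pairs = list(combinations(range(len(table)), 2))
    results = {}
    for i, j in pairs:
        p = mcnemar_exact(table[i][j], table[j][i])
        results[(labels[i], labels[j])] = (p, min(1.0, p * len(pairs)))
    return results

# Assumed counts (rows: brand A, B, C before; columns: brand A, B, C after).
table = [[20, 10, 5],
         [3, 30, 15],
         [0, 5, 40]]

results = pairwise_mcnemar(table, ["A", "B", "C"])
for (first, second), (p, p_adj) in results.items():
    print(f"{first} vs {second}: p = {p:.3f}, Bonferroni p = {p_adj:.3f}")
```

With these assumed counts only B versus C is below 0.05 before the adjustment, and none of the pairs is after it.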

It is possible for both the Bowker and the Bhapkar test to be significant, for both to be not significant, and for the Bowker test to be significant but not the Bhapkar test. As far as I understand, it is not possible for the Bowker test to be not significant while the Bhapkar test is significant, since if the table is symmetrical the marginal proportions must also be equal.
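
The relation between symmetry and equal marginal totals can be illustrated with two small made-up tables (both are hypothetical and unrelated to the example): a symmetric table always has identical row and column totals, while a table with identical totals does not have to be symmetric.

```python
def marginals(table):
    """Return (row totals, column totals) of a square table of counts."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]

def is_symmetric(table):
    """True if every count n_ij equals its mirror count n_ji."""
    k = len(table)
    return all(table[i][j] == table[j][i] for i in range(k) for j in range(k))

# Hypothetical symmetric table: switches from i to j are matched by switches from j to i.
symmetric = [[20, 7, 4],
             [7, 30, 9],
             [4, 9, 38]]

# Hypothetical 'circular' table: A -> B, B -> C and C -> A, never the other way round.
circular = [[10, 5, 0],
            [0, 10, 5],
            [5, 0, 10]]

for name, table in [("symmetric", symmetric), ("circular", circular)]:
    rows, cols = marginals(table)
    print(f"{name}: symmetric = {is_symmetric(table)}, equal marginals = {rows == cols}")
```

The circular table is the kind of situation in which a symmetry test can reject while a marginal homogeneity test has nothing to detect.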

Let's complete the report on the next page.
