Nominal vs. Nominal

Part 3a: Test for association (Pearson chi-square test of independence)

To test whether two nominal variables are associated, the most commonly used test is the Pearson chi-square test of independence. If the significance (the p-value) of this test is below 0.05, the association between the two nominal variables is considered statistically significant.

One problem, though, is that the Pearson chi-square test should only be used if not too many cells have a so-called expected count of less than 5, and the minimum expected count is at least 1. So you will have to check first whether these conditions are met. Most often ‘not too many cells’ is set at no more than 20% of the cells. Note that there are others who would say that all cells should have an expected count of at least 5.
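The check on the two conditions can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and the table of counts are made up for this example, and the 20% limit is passed in as a parameter so the stricter 'all cells at least 5' rule can be applied by setting it to 0.

```python
def expected_counts(observed):
    """Expected count per cell under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_conditions(observed, max_share_below_5=0.20):
    """Return (share of cells with expected count < 5, minimum expected count, ok)."""
    cells = [e for row in expected_counts(observed) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    min_expected = min(cells)
    ok = share_below_5 <= max_share_below_5 and min_expected >= 1
    return share_below_5, min_expected, ok


# Hypothetical counts (two rows, three columns); not the data from the example.
table = [[30, 12, 3],
         [25, 20, 2]]
share, minimum, ok = check_conditions(table)
print(f"{share:.0%} of cells below 5, minimum expected count {minimum:.2f}, ok: {ok}")
```

In this made-up table the last column is so small that two of the six cells have an expected count below 5, so the check fails even though the minimum expected count is above 1.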

Once you have checked the conditions and looked at the results, you can report the test results. In the example the percentage of cells with an expected count less than 5 is actually 0%, so it is okay to use the test. The test results could then be reported as something like:

Gender and marital status showed a significant association, χ²(4, N = 1941) = 16.99, p = .002.
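For readers who want to see where the three numbers in such a report come from, here is a minimal Python sketch using only the standard library. The function names and the table are assumptions made for illustration (the counts are not the data behind the example), and the p-value routine is a simple series approximation that is adequate for illustration but loses precision for extremely small p-values; statistical software would normally be used instead.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square distribution, via the series
    expansion of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    term = 1 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= half_x / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_test(observed):
    """Pearson chi-square test of independence; returns (statistic, df, p-value)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, chi_square_sf(statistic, df)


# Hypothetical 2 x 3 table of counts.
table = [[40, 30, 30],
         [20, 30, 50]]
statistic, df, p = chi_square_test(table)
n = sum(map(sum, table))
print(f"chi-square({df}, N = {n}) = {statistic:.2f}, p = {p:.3f}")
```

The degrees of freedom are (rows - 1) × (columns - 1), which is why a table of 2 genders by 5 marital statuses gives the 4 shown in the report.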


You might now also wonder what the association actually is (which marital statuses differ between men and women). This will be the topic of the next page.

FAQs

What if I do not meet the conditions?

If your data do not meet the two criteria, all is not lost. You could perhaps combine some categories that have a low count (e.g. combine all marital statuses other than married into one), or you could perform a so-called Fisher exact test.

With a Fisher exact test we only need to check the significance, and the interpretation is similar to that of the chi-square test. In the report this might look something like:

A two-sided Fisher exact test showed that gender and marital status have a significant association (N = 1941, p < .001).
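To give an idea of what the Fisher exact test does, below is a minimal Python sketch for the simplest case, a 2 × 2 table. It is an illustration under assumptions: the function name and the counts are made up, and it uses one common definition of the two-sided p-value (adding up the probabilities of all tables that are no more likely than the observed one). Larger tables, such as gender by marital status, need a generalisation that is best left to statistical software.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher exact test p-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    observed = probability(a)
    # Sum over all possible tables that are at most as likely as the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(p for p in map(probability, range(low, high + 1))
               if p <= observed * (1 + 1e-7))


# Hypothetical counts.
p_value = fisher_exact_2x2([[3, 1],
                            [1, 3]])
print(f"two-sided Fisher exact test, p = {p_value:.3f}")
```

Because the probabilities are calculated exactly from the row and column totals, no expected-count conditions need to be checked.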

What are these 'expected values'?

The expected values are the numbers of respondents you would expect in each cell if the two variables were independent. For each cell this is the row total multiplied by the column total, divided by the total number of respondents.

If, for example, I had 50 male and 50 female respondents, and 50 agreed with a statement and 50 disagreed with it, the expected value for each combination (male-agree, female-agree, male-disagree, and female-disagree) would be 50 × 50 / 100 = 25.

Note that if in the survey the actual results were that all males disagreed and all females agreed, there would be a full dependency (i.e. gender fully determines whether you agree or disagree), even though the row and column totals would still be 50. In essence, the Pearson chi-square test checks whether your data are closer to the expected values (independence) or to full dependency.
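The two situations described above can be worked out in a short Python sketch. The function names are chosen for this illustration only; the counts are the 50/50 example from the text, with rows for male and female and columns for agree and disagree.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))


independent = [[25, 25],   # males:   25 agree, 25 disagree
               [25, 25]]   # females: 25 agree, 25 disagree
dependent = [[0, 50],      # all males disagree
             [50, 0]]      # all females agree

print(expected_counts(dependent))         # [[25.0, 25.0], [25.0, 25.0]]
print(chi_square_statistic(independent))  # 0.0
print(chi_square_statistic(dependent))    # 100.0
```

Both tables have the same expected values, because the row and column totals are the same. The chi-square value is 0 when the data match the expected values exactly, and in this 2 × 2 example it reaches its maximum, equal to the number of respondents, under full dependency.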

Who came up with this stuff?

The Pearson chi-square test is named after Karl Pearson, who described the test in 1900.

The condition of at most 20% is often attributed to Cochran (1954, p. 420), but Fisher (1925, p. 83) was stricter and did not allow any cells with an expected count of less than 5.

The Fisher exact test is named after Ronald Aylmer Fisher who described the test in 1925.

Are there any alternatives or variations?

The Pearson chi-square test and the Fisher exact test are probably the two most frequently used tests in this situation. However, other tests exist that some claim perform even better. An alternative to the Pearson chi-square test is the G-test (also known as a likelihood ratio test), and alternatives to the Fisher exact test are the Barnard test and the Boschloo test.
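To show how the G-test differs from the Pearson chi-square test, here is a minimal Python sketch of its test statistic, G = 2 × Σ observed × ln(observed / expected). The function name and the counts are assumptions for illustration. The G value is compared against the same chi-square distribution, with the same degrees of freedom, as the Pearson statistic.

```python
import math


def g_statistic(observed):
    """Likelihood ratio (G) statistic for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            if count > 0:  # a cell with a count of 0 contributes nothing
                expected = row_totals[i] * col_totals[j] / n
                g += count * math.log(count / expected)
    return 2 * g


# Hypothetical 2 x 3 table of counts.
table = [[40, 30, 30],
         [20, 30, 50]]
print(f"G = {g_statistic(table):.2f}")
```

For tables where the observed counts are close to the expected counts, the G value and the Pearson chi-square value are usually close to each other.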

Another option worth mentioning is that for chi-square tests (such as the Pearson test and the G-test) some corrections have been suggested. These include the Yates correction (Yates, 1934), the Williams correction (Williams, 1976), and the E.S. Pearson correction (Pearson, 1947).
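As an example of such a correction, the sketch below applies the Yates correction in its usual textbook form for a 2 × 2 table: 0.5 is subtracted from each absolute difference between observed and expected count before squaring. The function name and the counts are assumptions for illustration, and software implementations may differ in details (for instance when a difference is smaller than 0.5).

```python
def chi_square_2x2(table, yates=False):
    """Pearson chi-square statistic for a 2 x 2 table, optionally Yates-corrected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    adjustment = 0.5 if yates else 0.0
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (abs(count - expected) - adjustment) ** 2 / expected
    return statistic


# Hypothetical counts.
table = [[12, 8],
         [5, 15]]
print(f"uncorrected: {chi_square_2x2(table):.2f}")
print(f"with Yates correction: {chi_square_2x2(table, yates=True):.2f}")
```

The corrected value is lower than the uncorrected one, so the test becomes more conservative.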

How do you get that chi symbol (χ) in Word?

Type the letter 'c', then select it and change the font to 'Symbol'.
