# Analysing a single nominal variable

## Are the percentages equal? (Pearson chi-square goodness-of-fit test)


One question you might have with a nominal variable is whether each category had the same number of respondents (i.e. the same percentage). With the marital status example from the previous paragraphs, we might expect each of the five categories to have (100% / 5 =) 20%. This would mean that we’d expect 20% of 1941 = 388.2 people in each category. This is known as the **expected count** or **expected frequency**.

Our observed frequencies are different from the expected ones. The Pearson chi-square test of goodness-of-fit (Pearson, 1900) can determine if the differences between the observed and expected counts are significant. If the test result is a p-value below .05, it is usually considered significant, indicating that the frequencies of at least some categories differ significantly.

One problem though is that the Pearson chi-square test of goodness-of-fit should only be used if not too many cells have a low so-called expected count. For this test it is usually required that all cells have an expected count of at least 5 (see for example Peck & Devore, 2012, p. 593) (note that for a Pearson chi-square test of independence the conditions are different). See the appendix at the bottom of the page for possible solutions if you do not meet this criterion.
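As a quick illustration, this expected-count check takes only a few lines of Python. The observed counts below are hypothetical (they are not the actual survey data), chosen only so the total matches the example's sample size:

```python
# Rule of thumb check: every expected count should be at least 5.
# Hypothetical observed counts for five categories (sum to 1941).
observed = [886, 400, 300, 200, 155]
n, k = sum(observed), len(observed)

# With equal expected percentages, every expected count is n / k.
expected = n / k  # 1941 / 5 = 388.2
print(expected >= 5)  # True: the test may be used
```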

Once you have checked the conditions and looked at the results, you can report the test results. You will need the significance, but also the chi-square value itself, the sample size (the number of respondents that answered this question), and the so-called degrees of freedom. For this test, the degrees of freedom is simply the number of categories minus 1. In the example the percentage of cells with an expected count less than 5 is actually 0%, so it is okay to use the test. The test results showed that the sig. was .000, which indicates it was less than .0005; this is then often reported simply as < .001. We had five categories, so the degrees of freedom is 5 − 1 = 4, the sample size was 1941, and the chi-square value (calculated by the software) was 1249.13. The test results could then be reported as something like:

A chi-square test of goodness-of-fit was performed to determine whether the marital status categories were equally chosen. The marital status was not equally distributed in the population, *χ*^{2}(4, *N* = 1941) = 1249.13, *p* < .001.
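Software packages handle this calculation for you. A minimal sketch in Python using `scipy.stats.chisquare`; note that the observed counts here are hypothetical (the per-category survey counts are not listed in this text), chosen only so that they sum to 1941:

```python
from scipy.stats import chisquare

# Hypothetical observed counts for five categories (sum to 1941;
# NOT the actual marital status data, just an illustration).
observed = [886, 400, 300, 200, 155]

# With no expected counts given, chisquare assumes equal proportions.
chi2_val, p = chisquare(observed)
df = len(observed) - 1

print(f"chi2({df}, N = {sum(observed)}) = {chi2_val:.2f}, p = {p:.4g}")
```

The reporting line in the text follows the same pattern: the statistic, the degrees of freedom, the sample size, and the p-value.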


**Manually (Formula and example)**

**The formulas**

The Pearson chi-square goodness-of-fit test statistic (*χ*^{2}):

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

In this formula *O*_{i} is the observed count in category *i*, *E*_{i} is the expected count in category *i*, and *k* is the number of categories.

If the expected frequencies are expected to be equal, then:

$$E_i = \frac{n}{k}$$

where *n* is the total sample size.

The degrees of freedom is given by:

$$df = k - 1$$

**Example**

We have the observed frequencies of five categories, with a total sample size of *n* = 1941 (the marital status example from above).

Note that since there are five categories, we have *k* = 5. If the expected frequency for each category is expected to be equal, we can use the formula to determine:

$$E_i = \frac{1941}{5} = 388.2$$

Then we can determine the Pearson chi-square value:

$$\chi^2 = \sum_{i=1}^{5} \frac{(O_i - 388.2)^2}{388.2} = 1249.13$$

The degrees of freedom is:

$$df = 5 - 1 = 4$$

To determine the significance you then need to determine the area under the chi-square distribution curve to the right of this value, in formula notation:

$$p = P\left(\chi^2(4) \geq 1249.13\right)$$

This is usually done with the aid of either a distribution table, or some software.
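For example, the tail area for the chi-square value from this example can be found in Python (a sketch using `scipy.stats.chi2`, the chi-square distribution):

```python
from scipy.stats import chi2

# p-value for the example: chi-square value 1249.13 with df = 4.
# sf (survival function) gives the area in the right tail.
p = chi2.sf(1249.13, df=4)
print(p < .001)  # True
```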

The Pearson chi-square goodness-of-fit test is actually a so-called omnibus test: it tests all categories at once. Unfortunately, because of this, when the result is significant it does not inform us which categories differ significantly. How to determine this is discussed in the next part.

As an alternative to the Pearson chi-square test of goodness-of-fit there is also a so-called **G-test**, sometimes called a **likelihood-ratio test**, that could be used instead. The advantage of a G-test is that the results are additive, which means they can be combined with other results in larger studies; the disadvantage is that it is a far less familiar test than the Pearson version (McDonald, 2014).

In short, the Pearson chi-square test of goodness-of-fit has the following steps:

- The assumption about the population (the null hypothesis, H_{0}) is that the observed and expected counts are the same, i.e. that each category is equally likely in the population.
- The alternative (H_{a}) is that they aren't, i.e. at least one category has a different proportion.
- Perform the test and find the p-value (sig.).
- If the p-value is less than .05, the chance of a result as in the sample (or an even rarer one) if the assumption is true is considered so low that the assumption is probably NOT true. We then reject H_{0} and conclude H_{a}. This is called a significant result.
- If the p-value is .05 or more, the chance of a result as in the sample (or an even rarer one) if the assumption is true is not considered low enough to reject the assumption. We don't have enough evidence to reject it. This is called a non-significant result.

## Appendix: formulas for the likelihood-ratio goodness-of-fit test

In the formulas the following variables will be used:

- *O*_{i} is the observed count in category *i*
- *E*_{i} is the expected count in category *i*
- *k* is the number of categories

The likelihood-ratio goodness-of-fit test statistic (*G*):

$$G = 2 \sum_{i=1}^{k} O_i \ln\left(\frac{O_i}{E_i}\right)$$

The degrees of freedom (*df*):

$$df = k - 1$$
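A sketch of the G-test in Python, using `scipy.stats.power_divergence` and checking the result against the formula above (the observed counts are hypothetical, only to illustrate):

```python
import math
from scipy.stats import power_divergence

# Hypothetical observed counts; equal expected counts are assumed.
observed = [886, 400, 300, 200, 155]
expected = sum(observed) / len(observed)  # 1941 / 5 = 388.2

# G-test via scipy (lambda_="log-likelihood" selects the G statistic).
g, p = power_divergence(observed, lambda_="log-likelihood")

# Manual check against the formula G = 2 * sum(O_i * ln(O_i / E_i)).
g_manual = 2 * sum(o * math.log(o / expected) for o in observed)
print(abs(g - g_manual) < 1e-8)  # True: both give the same G
```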

## A bit more background on the chi-square test

Small differences between observed and expected counts will almost always occur, but if the differences are big, then perhaps they differ not only in the sample, but also in the population. The big question then is: how big should the differences (between the observed and expected) be, in order to conclude that there will also be differences in the population?

Karl Pearson is a famous statistician who answered this question. He figured out that if all percentages were equal in the population, and you would take all possible samples from that population and calculate a ‘funny’ number each time, these ‘funny’ numbers would form a so-called chi-square distribution. The official name for this ‘funny’ number is a chi-square value. The 'chi' is the Greek letter *χ*, and with the 'square' it is therefore often written as *χ*^{2} (and 'chi' is therefore not pronounced as the chi in 'tai chi').

So, with one sample we can calculate one chi-square value, and thanks to Mr. Pearson we also know that if the percentages are equal in the population, then this is one of the possible chi-square values that form a chi-square distribution. If we had a chi-square value of 10, visually that might look something like what is shown in Figure 1.

*Figure 1*. Example of a chi-square distribution

The curve is drawn using a difficult equation, but the height is actually not so important; the area under the curve is far more interesting. The areas under the curve determine the probability. The shaded area in Figure 1 is the chance of such a chi-square value of 10, or an even bigger one, and is known as the **significance**, or **p-value**. It is the chance of a result as in the sample, or an even bigger one, if the assumption about the population is true. In this test the ‘result as in the sample’ is the chi-square value of the sample, and ‘the assumption’ is the assumption that the percentages are equal.
If this chance (of a chi-square value as in our sample, or an even bigger one) is very low, it could mean that:

1) We were simply very unlucky and have one of the very few rare situations or;

2) The assumption that the percentages are equal in the population is incorrect

It is convention to go for the second option if the chance is lower than .05 (less than 5%).
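This tail area can be computed with software. A minimal sketch in Python for the chi-square value of 10 shaded in Figure 1, assuming df = 4 purely for illustration (the figure itself does not state the degrees of freedom):

```python
from scipy.stats import chi2

# Right-tail area for a chi-square value of 10, as shaded in Figure 1.
# The df = 4 is an assumption for illustration; it is not given in the text.
p = chi2.sf(10, df=4)
print(round(p, 4))  # 0.0404, just below the .05 convention
```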
All of the above is known as a *Pearson chi-square goodness-of-fit test* (Pearson, 1900). If the significance of this test is below .05, the percentages in the population are most likely also not equally distributed.

How to calculate the area under that curve is quite complex and usually left to a software package like SPSS, R, or Excel, an online calculator, or a distribution table.

## If you don't meet the criteria

If you do not meet the criterion, there are three options. First off, are you sure you have a nominal variable, and not an ordinal one? If you have an ordinal variable, you probably want a different test. If you are sure you have a nominal variable, you might be able to combine two or more categories into one larger category. If for example you asked people about their country of birth, but a few countries were only selected by one or two people, you might want to combine these simply into a category ‘other’. Be very clear though in your report that you’ve done so. Another option is to use a so-called **exact multinomial test**. This test does not have the expected-count requirement, but needs larger differences before it will actually flag some of them as significant. This test is out of scope for this document.
