Analysing a nominal and ordinal variable
Part 3a: Test for differences
On the previous page, we noticed that in the sample the results in Diemen seem more positive than at the other two locations. To test whether this might also be the case in the population, we can use a Kruskal-Wallis H test (Kruskal & Wallis, 1952). This test compares the rankings of the scores rather than simply the median of each category. If you truly want to test whether the medians of the categories differ, use Mood's Median Test (Mood, 1950) instead.
In the output of the test you will find the significance (p-value). In this example it is shown as 0.000, which means it is below 0.0005 after rounding to three decimals. It is the probability of obtaining a sample with a Kruskal-Wallis H value of 21.328 or higher if there were no difference between the three groups in the population. Since this probability is so low (below 0.050), we conclude that there most likely is a difference between the three locations in the population.
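To see where a significance of "0.000" comes from, the p-value can be approximated by hand. The sketch below assumes 2 degrees of freedom (three locations minus one, matching the χ2(2, ...) in the APA report) and uses the fact that the upper tail of a chi-square distribution with 2 degrees of freedom is exp(-x / 2). The function name `chi2_upper_tail_df2` is illustrative only.

```python
import math

def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square distribution with exactly 2 degrees of freedom."""
    # For df = 2 the chi-square survival function has the closed form exp(-x / 2).
    return math.exp(-x / 2)

h_statistic = 21.328  # H value reported in the text
p_value = chi2_upper_tail_df2(h_statistic)

# The result is far below 0.001, which software rounds to "0.000".
print(f"p = {p_value:.2e}")
```

For other degrees of freedom this shortcut does not apply, and a statistics library would be needed.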
In formal APA style we can report this result as:
A Kruskal-Wallis test showed that Location had a modest significant effect on how motivated students were by the teacher, χ2(2, N = 54) = 21.33, p < .001.
The Kruskal-Wallis H test can be performed with SPSS, R (Studio), Excel, Python, or manually.
with SPSS
The video below shows how to perform the Kruskal-Wallis H test with SPSS, as well as the post-hoc test that is discussed on the next page.
With SPSS 22 or 23 you might get an error for the pairwise comparison. This can be resolved by following the steps in the video below.
with R (Studio)
with Excel
with Python
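A minimal sketch in plain Python, using only the standard library. It computes H as (N - 1) times the between-category sum of squared rank deviations divided by the total sum of squared rank deviations, with tied scores sharing their average rank, as described in the manual section. The function name `kruskal_wallis_h` and the example scores are hypothetical and are not the data used in this section.

```python
from collections import Counter

def kruskal_wallis_h(groups):
    """Return (H, df) for a list of groups of ordinal scores, using average ranks for ties."""
    all_scores = [score for group in groups for score in group]
    n_total = len(all_scores)

    # Average rank per distinct score: tied scores share the mean of the ranks they occupy.
    rank_of = {}
    next_rank = 1
    for value, count in sorted(Counter(all_scores).items()):
        rank_of[value] = next_rank + (count - 1) / 2
        next_rank += count

    mean_rank = (n_total + 1) / 2  # average of the ranks 1..N

    # Numerator: group size times squared deviation of the group's average rank.
    numerator = sum(
        len(group) * (sum(rank_of[s] for s in group) / len(group) - mean_rank) ** 2
        for group in groups
    )
    # Denominator: squared deviation of every individual rank.
    # Note: this is zero (division error) if all scores are identical.
    denominator = sum((rank_of[s] - mean_rank) ** 2 for s in all_scores)

    h = (n_total - 1) * numerator / denominator
    return h, len(groups) - 1

# Hypothetical scores for three categories (illustration only).
h, df = kruskal_wallis_h([[1, 2], [3, 4], [5, 6]])
print(round(h, 4), df)  # 4.5714 2
```

The significance then follows from a chi-square distribution with `df` degrees of freedom.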
Manually (formulas and example)
Formula
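The formula for the H-statistic is:
$$H = (N - 1) \frac{\sum_{i=1}^{k} n_i \left(\bar{r}_i - \bar{r}\right)^2}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(r_{i,j} - \bar{r}\right)^2}$$
In this formula $N$ is the total sample size, $k$ is the number of categories, $n_i$ the number of cases in category $i$, $\bar{r}_i$ the average of the ranks in category $i$, $\bar{r}$ the average of all ranks, and $r_{i,j}$ the rank of the $j$-th score in category $i$.
The degrees of freedom is given by:
$$df = k - 1$$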
Example
Note: different example than the one used in the rest of this section.
We are given scores on an ordinal scale from three categories:
Since there are five scores in the 1st category, four in the 2nd, and four in the 3rd, we can determine: $n_1 = 5$, $n_2 = 4$, $n_3 = 4$.
In total we have 5 + 4 + 4 = 13 scores, and there are three categories, so: $N = 13$ and $k = 3$.
To determine the rank, we combine all scores into one large sequence:
Then we sort this sequence: 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5.
The lowest score is a 1, which occurs twice. These two scores get ranks 1 and 2, or on average 1.5. Then there are four 2's, which get ranks 3, 4, 5, and 6, or on average 4.5. Then two 3's that get ranks 7 and 8, or on average 7.5, then two 4's with ranks 9 and 10, on average 9.5, and three 5's with ranks 11, 12, and 13, on average 12. It is these average ranks that we will use.
Each score of 1 has a rank of 1.5, each score of 2 a rank of 4.5, a score of 3 a rank of 7.5, a score of 4 a rank of 9.5 and a score of 5 a rank of 12.
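The ranking step can be sketched in a few lines of Python. The scores below are the thirteen values described in the text (two 1's, four 2's, two 3's, two 4's, and three 5's); the function name `average_ranks` is illustrative only.

```python
from collections import Counter

def average_ranks(scores):
    """Map each distinct score to the average of the ranks it occupies in the sorted sequence."""
    counts = Counter(scores)
    ranks = {}
    next_rank = 1  # first rank that has not been handed out yet
    for value in sorted(counts):
        count = counts[value]
        # Ranks next_rank .. next_rank + count - 1 are shared by the tied scores.
        ranks[value] = next_rank + (count - 1) / 2
        next_rank += count
    return ranks

scores = [1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5]
rank_of = average_ranks(scores)
print(rank_of)  # {1: 1.5, 2: 4.5, 3: 7.5, 4: 9.5, 5: 12.0}

# Average of all thirteen ranks.
mean_rank = sum(rank_of[s] for s in scores) / len(scores)
print(mean_rank)  # 7.0
```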
The average ranks per category now become:
And the average of all ranks: (2 × 1.5 + 4 × 4.5 + 2 × 7.5 + 2 × 9.5 + 3 × 12) / 13 = 91 / 13 = 7.
Now we can tackle that big formula. Let's start with the numerator (the top part) of the fraction:
Then the denominator:
Let's do each of these terms (the sums) separately:
Using these three results we can now determine:
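The total of the denominator can be checked without knowing which score belongs to which category: the per-category sums together cover every rank exactly once, so their total depends only on the ranks themselves. A minimal sketch, using the average ranks from this example (variable names are illustrative):

```python
# The thirteen scores of the example and their average ranks, as given in the text.
scores = [1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5]
rank_of = {1: 1.5, 2: 4.5, 3: 7.5, 4: 9.5, 5: 12.0}

# The average of the ranks 1..N is always (N + 1) / 2.
mean_rank = (len(scores) + 1) / 2

# Sum of squared deviations of every individual rank from the overall average rank.
denominator = sum((rank_of[s] - mean_rank) ** 2 for s in scores)
print(denominator)  # 173.5
```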
And finally the H-statistic:
And the degrees of freedom: $df = k - 1 = 3 - 1 = 2$.
Note that the Kruskal-Wallis test eventually uses a chi-square distribution, which is why the χ2 is shown.
Now that we know location seems to influence motivation, we can explore between which locations the difference is significant. This is done with a post-hoc test, which is discussed on the next page.