Analysing a binary vs. scale variable
Effect size: Cohen's ds
When there is a significant difference, we might also want to check the ‘size’ of the difference. For example, if we had a difference of 0.0003 in grades, then with extremely large sample sizes this could still be significant, but it would not really be relevant. To measure the size of the difference we need a so-called effect size.
An appropriate effect size in the case of a binary and a scale variable is Cohen’s ds (Cohen, 1988), although Hedges’ g (Hedges, 1981) might be preferred if you have fewer than 20 respondents (Lakens, 2013).
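As a rough illustration of how Hedges' g relates to Cohen's ds, the sketch below applies a commonly used approximate small-sample correction factor, 1 - 3 / (4(n1 + n2) - 9). The function name `hedges_g` and the input values are hypothetical and not taken from this section's example.

```python
def hedges_g(d_s, n1, n2):
    """Shrink Cohen's d_s with an approximate small-sample correction.

    Assumption: uses the approximation 1 - 3 / (4 * (n1 + n2) - 9),
    not the exact correction factor.
    """
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d_s * correction


# Hypothetical values: d_s = 0.5 with 8 and 9 respondents per category
print(round(hedges_g(0.5, 8, 9), 3))
```

The correction is always slightly below 1, so g is a little smaller than ds, and the two converge as the sample size grows.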
Cohen’s ds divides the difference between the two means by the so-called pooled standard deviation (Cohen, 1988, pp. 66-67). In the example this results in a Cohen’s ds of 0.28. Cohen gives a rule of thumb for the interpretation, shown in Table 1.
Cohen’s d | Interpretation |
---|---|
0.00 < 0.20 | very small |
0.20 < 0.50 | small |
0.50 < 0.80 | medium |
0.80 or more | large |
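The rule of thumb from Table 1 can be written as a small lookup. This is only a sketch: the function name `interpret_cohens_d` is hypothetical, and taking the absolute value (so that the sign only indicates the direction of the difference) is an assumption rather than something the table states.

```python
def interpret_cohens_d(d):
    """Classify Cohen's d using the thresholds from Table 1."""
    size = abs(d)  # assumption: the sign only shows the direction
    if size < 0.20:
        return "very small"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


print(interpret_cohens_d(0.28))  # small
```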
The sections below show how to determine Cohen's d with SPSS, R (studio), Python, an online calculator, or manually.
with SPSS
version 27+
versions prior to 27
Unfortunately, versions prior to 27 do not have an option in the GUI to determine Cohen's d. However, you can either take the output from the independent samples t-test and enter the results in the online calculator (see below), or use SPSS syntax. The video shows each option.
with R (studio)
with Python
Online calculator
Manually
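Cohen's ds formula is:

$$d_s = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}}$$

In this formula $\bar{x}_i$ is the mean for category i, which can be calculated by:

$$\bar{x}_i = \frac{\sum_{j=1}^{n_i} x_{i,j}}{n_i}$$

In this formula $x_{i,j}$ is the j-th score in category i, and $n_i$ is the number of cases in category i.

$s_{pooled}$ is the pooled standard deviation. In formula notation this is:

$$s_{pooled} = \sqrt{\frac{SS_1 + SS_2}{N - 2}}$$

In this formula N is the total number of cases (both categories combined), and $SS_i$ is the sum of squared differences with the mean, which in formula notation is:

$$SS_i = \sum_{j=1}^{n_i} \left(x_{i,j} - \bar{x}_i\right)^2$$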
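The manual steps described above (mean per category, sum of squares per category, pooled standard deviation, and finally Cohen's ds) can be sketched in plain Python. The function name `cohens_ds` and the scores are made up for illustration and are not the data used in this section. The sketch assumes the pooled standard deviation divides the combined sum of squares by N - 2.

```python
from math import sqrt


def cohens_ds(group1, group2):
    """Cohen's d_s: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)

    # Mean per category
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2

    # Sum of squared differences with the mean, per category
    ss1 = sum((x - mean1) ** 2 for x in group1)
    ss2 = sum((x - mean2) ** 2 for x in group2)

    # Pooled standard deviation (assumption: denominator is N - 2)
    s_pooled = sqrt((ss1 + ss2) / (n1 + n2 - 2))

    return (mean1 - mean2) / s_pooled


# Made-up scores, purely for illustration
scores_a = [2, 4, 6]
scores_b = [1, 2, 3]
print(round(cohens_ds(scores_a, scores_b), 2))  # 1.26
```

Swapping the two groups only flips the sign of the result; its size stays the same.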
An example.
Note: a different example than the one used in the rest of this section, to keep calculations a bit shorter.
Given are the scores of males (category 1) and females (category 2):
The first category has 5 scores, so n1 = 5, and for the second category we have n2 = 6. Let's begin by determining the mean per category:
Then we can determine the sum of squares per category. First the male category:
And for the female category:
The pooled standard deviation therefore is:
And finally Cohen's ds:
The 0.28 from the example would suggest a small effect.