# Analysing a binary vs. scale variable

## Effect size: Cohen's d_{s}

When there is a significant difference, we might also want to check the ‘size’ of the difference. For example, had we found a difference of 0.0003 in grades, then with extremely large sample sizes this could still be significant, but not really relevant. To measure the size of the difference we need a so-called effect size.

An appropriate effect size in case of a binary and a scale variable is Cohen’s d_{s} (Cohen, 1988), although Hedges' g (Hedges, 1981) might be preferred in case you have fewer than 20 respondents (Lakens, 2013).

Cohen’s d_{s} divides the difference between the two means by the so-called pooled standard deviation (Cohen, 1988, pp. 66-67). In the example this results in a Cohen’s d_{s} of 0.28. Cohen gives a rule of thumb for the interpretation, shown in Table 1.

| Cohen’s d | Interpretation |
|---|---|
| 0.00 < 0.20 | very small |
| 0.20 < 0.50 | small |
| 0.50 < 0.80 | medium |
| 0.80 or more | large |
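The rule of thumb can also be written as a small lookup. The sketch below is only an illustration: the function name `interpret_cohens_d` is made up here, and taking the absolute value is an assumption (the sign of d only tells which category has the higher mean).

```python
def interpret_cohens_d(d):
    """Label an effect size using the rule-of-thumb bands from the table."""
    size = abs(d)  # assumption: only the magnitude matters for the label
    if size < 0.20:
        return "very small"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


print(interpret_cohens_d(0.28))  # small
```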

**How to determine Cohen's d with SPSS, R (studio), Python, an online calculator, or manually.**

**with SPSS**

**version 27+**

**versions prior to 27**

Unfortunately, versions prior to 27 do not have an option in the GUI to determine Cohen's d. However, you can either take the output from the independent samples t-test and enter the results in the online calculator (see below), or use SPSS syntax. The video will show each option.

**with R (studio)**

**with Python**
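A minimal sketch of how this could be done with only Python's standard library, assuming the scores of the two categories are available as two lists. The function name `cohens_ds` and the scores are hypothetical, made up for illustration. It divides the difference of the two means by the pooled standard deviation, as described above.

```python
from math import sqrt
from statistics import mean, variance


def cohens_ds(group1, group2):
    """Cohen's d_s: difference of the two means divided by the pooled SD."""
    n1, n2 = len(group1), len(group2)
    # statistics.variance is the sample variance (it divides by n - 1),
    # so (n - 1) * variance gives the sum of squared differences with the mean.
    ss1 = (n1 - 1) * variance(group1)
    ss2 = (n2 - 1) * variance(group2)
    s_pooled = sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / s_pooled


# Hypothetical scores, not taken from the example in this section.
scores_a = [2, 4, 6]
scores_b = [1, 3, 5]
print(cohens_ds(scores_a, scores_b))  # 0.5
```

Each list needs at least two scores, otherwise `statistics.variance` raises an error.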

**Online calculator**


**Manually**

The formula for Cohen's d_{s} is:

$$d_s = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}}$$

In this formula $\bar{x}_i$ is the mean for category $i$, which can be calculated by:

$$\bar{x}_i = \frac{\sum_{j=1}^{n_i} x_{i,j}}{n_i}$$

In this formula $x_{i,j}$ is the $j$-th score in category $i$, and $n_i$ is the number of cases in category $i$.

$s_{pooled}$ is the pooled standard deviation. In formula notation this is:

$$s_{pooled} = \sqrt{\frac{SS_1 + SS_2}{N - 2}}$$

In this formula $N$ is the total number of cases (both categories combined), and $SS_i$ the sum of squared differences with the mean, which in formula notation is:

$$SS_i = \sum_{j=1}^{n_i} \left(x_{i,j} - \bar{x}_i\right)^2$$
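As a side note, software often reports a standard deviation per category rather than a sum of squares. Because $SS_i$ is the sum of squared differences with the category mean, it equals $(n_i - 1)$ times the sample variance of category $i$. Writing $s_i^2$ for that sample variance (a symbol introduced here, not used elsewhere in this section) and using $N = n_1 + n_2$, the pooled standard deviation can equivalently be written as:

$$SS_i = (n_i - 1)\, s_i^2 \quad\Longrightarrow\quad s_{pooled} = \sqrt{\frac{SS_1 + SS_2}{N - 2}} = \sqrt{\frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}}$$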

**An example.**

*Note*: a different example than the one used in the rest of this section, to keep calculations a bit shorter.

Given are the scores of males (category 1) and females (category 2):

The first category has 5 scores, so $n_1 = 5$, and for the second category we have $n_2 = 6$. Let's begin with determining the mean per category:

Then we can determine the sum of squares per category. First the male category:

And for the female category:

The pooled standard deviation is therefore:

And finally Cohen's d_{s}:

The 0.28 from the example would suggest a small effect.
