# Analysing a binary variable

## Effect size: Cohen's g

The one-sample binomial test can tell us whether the percentage in the population is significantly different from 50%, but it says nothing about how big the difference is. With only two categories we could simply leave it to the reader to judge whether the difference between the two percentages is big or small, but for many tests it is recommended to also report a so-called effect size measure.

Unfortunately, not much has been written about reporting an effect size for the one-sample binomial test. Rosnow and Rosenthal (2003) mention Cohen's g and Cohen's h as effect sizes for binary (they call it dichotomous) data. Cohen's h is also the effect size used for two proportions in the NCSS software (NCSS, n.d.-b). JonB (2015) on CrossValidated suggests using Relative Risks, which NCSS calls the Alternative Ratio (NCSS, n.d.-a).

I’ll use Cohen’s g myself, but for the interested reader the other two are explained in the end notes at the bottom of this page.

Cohen’s g (Cohen, 1988) is specifically for the case where the expected proportion in the population is 0.5 (50%). It is then simply the difference between the sample proportion and this 0.5. In the example the female proportion was 0.26 (26%), so Cohen’s g is 0.26 – 0.50 = -0.24. Note that if we had taken the male proportion we would have gotten 0.74 – 0.50 = 0.24. The only difference is the negative sign. Cohen’s g is therefore often reported as an absolute value (so without a negative sign; this is known as a nondirectional Cohen’s g). In the example the two sample proportions were 24 percentage points higher or lower than expected.
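As a minimal sketch of this arithmetic in Python: the function name `cohens_g` and its `directional` flag are my own illustrative choices, and the counts (12 females out of 46 respondents) are the ones from the example.

```python
def cohens_g(successes, n, directional=True):
    """Cohen's g: the sample proportion minus the expected proportion of 0.5.

    `successes` is the count in the category of interest, `n` the sample size.
    With directional=False the absolute value is returned.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    g = successes / n - 0.5
    return g if directional else abs(g)


g_female = cohens_g(12, 46)   # female count from the example
g_male = cohens_g(34, 46)     # male count from the example
print(round(g_female, 2), round(g_male, 2))  # -0.24 0.24
```

Working from the raw counts instead of the rounded 0.26 avoids carrying a rounding error into the result.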

Cohen provided some rules of thumb to interpret this, shown in Table 1.

| Cohen’s g | Interpretation |
|---|---|
| 0.00 < 0.05 | Negligible |
| 0.05 < 0.15 | Small |
| 0.15 < 0.25 | Medium |
| 0.25 or more | Large |
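The rule of thumb can be sketched as a small lookup. This assumes Cohen's conventional cut-offs of 0.05, 0.15 and 0.25, applied to the absolute value of g; the function name `interpret_cohens_g` is hypothetical.

```python
def interpret_cohens_g(g):
    """Classify Cohen's g, assuming cut-offs of 0.05, 0.15 and 0.25."""
    size = abs(g)  # the sign only shows the direction of the difference
    if size < 0.05:
        return "Negligible"
    if size < 0.15:
        return "Small"
    if size < 0.25:
        return "Medium"
    return "Large"


print(interpret_cohens_g(-0.24))  # Medium
```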

The 0.24 would fall in the Medium category (but is very close to the Large). We could add this to our findings:

An exact binomial test indicated that the percentage of females (*N*<sub>f</sub> = 12, 26%) was significantly different from the male percentage (*N*<sub>m</sub> = 34, 74%), *p* = .002. Cohen’s g suggests that the difference can be classified as medium, *g* = .24.
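The p-value in this sentence can be checked with a short standard-library sketch. It assumes the null proportion is 0.5, in which case the binomial distribution is symmetric and the two-sided p-value is simply twice the smaller tail; the function name is my own.

```python
from math import comb


def exact_binomial_p_two_sided(successes, n):
    """Two-sided exact binomial p-value for H0: proportion = 0.5.

    Assumes a symmetric null (0.5), so the smaller tail is doubled.
    """
    smaller = min(successes, n - successes)
    # P(X <= smaller) under Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_value = exact_binomial_p_two_sided(12, 46)
print(f"{p_value:.3f}")  # 0.002
```

For an expected proportion other than 0.5 the two tails are no longer mirror images, so this shortcut would not apply as written.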

The last step is to write all the results into a report, which will be discussed in the next section.

**Manually (using Formula)**

Given a sample proportion (*p*) and the expected proportion in the population (*π*), the formula for Cohen's g will be:

$$g = p - \pi$$

The sample proportion in the example was 0.26 and the expected proportion was 0.50, so in the example this gives:

$$g = 0.26 - 0.50 = -0.24$$

Often the absolute value is used (the so-called nondirectional Cohen's g):

$$g = \left|p - \pi\right| = \left|0.26 - 0.50\right| = 0.24$$

## End note: Alternatives for Cohen's g

Two alternative effect size measures for a one-sample binomial test are the Alternative Ratio (or Relative Risk) and Cohen's h.

**Alternative Ratio/Relative Risk**

The Alternative Ratio is only mentioned in the documentation of a program called PASS (NCSS, n.d.-a), and is referred to as Relative Risk by JonB (2015). Relative Risk is more often used with cross tables, so I’ll stick with ‘Alternative Ratio’. It is simply the sample proportion divided by the expected population proportion (which we set at 0.5, i.e. 50%). In the example the sample proportion of females was 0.26, and dividing this by 0.5 gives an Alternative Ratio of 0.26 / 0.5 = 0.52. This means that the female proportion was (1 – 0.52) = 48% lower than expected. Similarly, for the males we get 0.74 / 0.5 = 1.48, which indicates that the male proportion was 48% higher than expected. Unfortunately, there is no rule of thumb to determine whether 48% is high or low (although most people would find it pretty high).
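The same calculation as a small Python sketch. The function name `alternative_ratio` and the default `expected=0.5` are illustrative assumptions; the counts are those from the example.

```python
def alternative_ratio(successes, n, expected=0.5):
    """Sample proportion divided by the expected population proportion."""
    if n <= 0 or not 0 < expected <= 1:
        raise ValueError("need n > 0 and 0 < expected <= 1")
    return (successes / n) / expected


ar_female = alternative_ratio(12, 46)
ar_male = alternative_ratio(34, 46)
print(round(ar_female, 2), round(ar_male, 2))  # 0.52 1.48
```

A ratio below 1 means the category is less common than expected, and a ratio above 1 means it is more common.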

**Manually (with Formula)**

Given a sample proportion (*p*) and the expected proportion in the population (*π*), the formula for the Alternative Ratio (Relative Risk) will be:

$$AR = \frac{p}{\pi}$$

In the example the sample proportion of females was 0.26 and the expected proportion in the population 0.50. Filling this in the formula yields:

$$AR = \frac{0.26}{0.50} = 0.52$$

And for the male proportion, which was 0.74 in the sample, we get:

$$AR = \frac{0.74}{0.50} = 1.48$$

**Cohen’s h<sub>2</sub>**

Cohen’s h<sub>2</sub> looks at the difference between two proportions. However, it does not consider the difference between the two proportions directly, but rather the difference between the arcsine transformations of their square roots. Cohen explains that although 0.65 and 0.45 have the same difference as 0.25 and 0.05, the power for these two differences is actually different. To compensate for this a non-linear transformation is used, and the arcsine seems to do the trick. For more information see Cohen (1988, pp. 180–181).

Note that Cohen's h is slightly different from Cohen's h<sub>2</sub>. The general Cohen's h is used to compare the proportions of two independent samples, not for a one-sample scenario such as a binomial test. Cohen gives interpretation guidelines only for Cohen's h, not for h<sub>2</sub>, but does give a conversion on page 203:

$$h = h_2 \times \sqrt{2}$$

The interpretation of h can then be done using the following table (adapted from Cohen, 1988, p. 198):

| Cohen’s h | Interpretation |
|---|---|
| 0.00 < 0.20 | Negligible |
| 0.20 < 0.50 | Small |
| 0.50 < 0.80 | Medium |
| 0.80 or more | Large |
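A hedged sketch of the whole chain in Python, using only the standard library. It assumes the transformation φ = 2·arcsin(√p), the difference h<sub>2</sub> = φ<sub>sample</sub> − φ<sub>expected</sub>, and the conversion h = h<sub>2</sub>·√2; all function names are my own, and the thresholds are those of the table above.

```python
import math


def cohens_h2(successes, n, expected=0.5):
    """One-sample Cohen's h2: difference of arcsine-transformed proportions."""
    p = successes / n
    phi_sample = 2 * math.asin(math.sqrt(p))
    phi_expected = 2 * math.asin(math.sqrt(expected))
    return phi_sample - phi_expected


def h_from_h2(h2):
    """Convert h2 to Cohen's h (assumed conversion: multiply by sqrt(2))."""
    return h2 * math.sqrt(2)


def interpret_cohens_h(h):
    """Classify |h| with the thresholds 0.20, 0.50 and 0.80."""
    size = abs(h)
    if size < 0.20:
        return "Negligible"
    if size < 0.50:
        return "Small"
    if size < 0.80:
        return "Medium"
    return "Large"


h2 = cohens_h2(12, 46)  # 12 females out of 46 respondents
h = h_from_h2(h2)
print(f"h2 = {h2:.2f}, h = {h:.2f} ({interpret_cohens_h(h)})")
# h2 = -0.50, h = -0.71 (Medium)
```

The negative sign only reflects that the female proportion lies below the expected 0.5; the classification uses the absolute value.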

**Manually (using formula)**

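The formula for Cohen's h<sub>2</sub> will be:

$$h_2 = \phi_1 - \phi_c$$

Where *φ*<sub>i</sub> is determined by:

$$\phi_i = 2 \times \arcsin\left(\sqrt{p_i}\right)$$

Where *p*<sub>i</sub> is the sample proportion of category *i*, arcsin the inverse sine function (also known as sin<sup>-1</sup>), and *p*<sub>c</sub> the expected proportion.

In the example the female proportion is approximately 0.2609; filling this in for phi (*φ*) gives:

$$\phi_1 = 2 \times \arcsin\left(\sqrt{0.2609}\right) \approx 1.072$$

For the expected proportion, which was 0.50, we get:

$$\phi_c = 2 \times \arcsin\left(\sqrt{0.50}\right) \approx 1.571$$

Filling these results in the formula for Cohen's h<sub>2</sub> we get:

$$h_2 \approx 1.072 - 1.571 = -0.499$$

And just for the interpretation, converting this to Cohen's h we get:

$$h = h_2 \times \sqrt{2} \approx -0.499 \times 1.414 \approx -0.71$$

Its absolute value, 0.71, falls in the Medium category.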
