Analysing a binary variable
Test: Binomial test
(if you prefer to watch a video on this rather than read, click here)
In the example we noticed that 26% of the respondents indicated they were Female, and 74% Male. This might appear to be a big difference, but is it a ‘significant’ difference, i.e. will there also be a difference in the population?
As discussed in the general section on significance (see here), the significance is the probability of a result as in the sample or even more extreme, if the assumption about the population is true.
With only two options to choose from, the assumption about the population is most often that both groups are equal. This would mean that if we pick a random person from the population, the chance of him/her belonging to either category is 0.5 (50%).
The result in the sample was that we had 12 Female respondents. ‘More extreme’ would be fewer than 12. What we can ‘easily’ determine now is the probability of getting 12 or fewer Female respondents out of 46, if in the population the chance of picking a female is 0.5. This can be done using a so-called binomial distribution. The chance for this is 0.0008. However, ‘more extreme’ can also mean that we have a similar over-representation of females. If the expected proportion is 0.5 (50%) then we can simply double the result, so the significance is 2 x .0008 = 0.0016.
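A minimal sketch of this calculation in Python, assuming SciPy is available (the variable names are just for illustration):
# probability of 12 or fewer Females out of 46, if the population chance is 0.5
from scipy.stats import binom
k = 12    # number of Female respondents in the sample
n = 46    # total sample size
p = 0.5   # assumed chance of Female in the population
oneSided = binom.cdf(k, n, p)    # ≈ 0.0008
twoSided = 2 * oneSided          # ≈ 0.0016 (doubling is only valid because p = 0.5)
print(oneSided, twoSided)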
We could report this as:
An exact binomial test indicated that the female percentage (Nf = 12, 26%) was significantly different from the male percentage (Nm = 34, 74%), p = .002.
Click here to see how to perform a binomial test...
with Excel
Excel file from video TS - Binomial Exact Test.xlsm.
with Flowgorithm
A basic implementation for a one-sample binomial test is shown in the flowchart in figure 1.
Figure 1
Flowgorithm for one-sample binomial test
It takes as input the frequency of one of the categories (k) and the sample size (n), and makes use of the cumulative distribution function of the binomial distribution.
Flowgorithm file: TS - Binomial (one-sample).fprg.
with Python
Jupyter Notebook from videos (with and without using libraries): TS - Binomial Exact Test.ipynb.
Datafile used in video: StudentStatistics.csv
Basic code example:
# libraries needed
import pandas as pd
from scipy.stats import binom_test  # note: newer SciPy versions use scipy.stats.binomtest instead

# load the data
myDf = pd.read_csv('../../Data/csv/StudentStatistics.csv', sep=';')

# frequency of each category of the variable
myCd = myDf['Gen_Gender'].value_counts()

# the test: count of the first category, total sample size, expected proportion of 1/2
binom_test(myCd.values[0], sum(myCd.values), 1/2, alternative='two-sided')
with R (Studio)
R script from video TS - Binomial Exact Test.R.
Datafile used in video: StudentStatistics.sav
Basic code example:
#one sample binomial test
#Preparation
#Getting some data
#install.packages("foreign")
library(foreign)
myData <- read.spss("../Data Files/StudentStatistics.sav", to.data.frame = TRUE)
#Remove na's
myVar <- na.omit(myData$Gen_Gender)
#Determine number of successes
k <- sum(myVar==myVar[1])
#Determine total sample size
n <- length(myVar)
#Test with the expectation that both groups are equal
#Perform binomial test
binom.test(k,n)
#Or use the binomial distribution directly
#(doubling is only valid if k is in the lower tail and the expected proportion is 0.5)
2*pbinom(k,n,.5)
with SPSS
using non-parametric tests
Datafile used in video: StudentStatistics.sav
using Legacy Dialogs
Datafile used in video: StudentStatistics.sav
using compare means
Datafile used in video: StudentStatistics.sav
Manually (Formulas)
A one-sample binomial test is almost ‘just’ the same as using the binomial distribution.
It uses a probability of success (p), which for the binomial test is the expected proportion in the population; the number of trials (n), which is the total sample size; and the number of successes (k), which is the number of occurrences in one of the categories.
The formula for the cumulative binomial distribution (F(k; n,p)) is:
\(F\left(k;n,p\right)=\sum_{i=0}^{\left\lfloor k\right\rfloor}\binom{n}{i}\times p^{i}\times\left(1-p\right)^{n-i}\)
If p = 0.5 the formula can be simplified to:
\(F\left(k;n,0.5\right)=0.5^{n}\times\sum_{i=0}^{\left\lfloor k\right\rfloor}\binom{n}{i}\)
In the formula ⌊k⌋ is the 'floor' function. This gives the greatest integer (whole number) less than or equal to k. So for example ⌊2.8⌋ = 2, and ⌊-2.2⌋=-3.
\(\binom{n}{i}\) is the binomial coefficient, which can be calculated using:
\(\binom{n}{i}=\frac{n!}{i!\times\left(n-i\right)!}\)
In this formula the ! indicates the factorial operation:
\(n!=\prod_{i=1}^{n}i\), and 0! is defined as 0! = 1.
These formulas are discussed in more detail in the binomial distribution section.
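As a small illustration, the cumulative distribution formula above can be coded directly, without any statistics libraries. A minimal sketch in Python, assuming Python 3.8 or later for math.comb (the function name is just for illustration):
# cumulative binomial distribution F(k; n, p), following the formula above
from math import comb, floor

def binomCdf(k, n, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(floor(k) + 1))

# two-sided significance for the example (12 out of 46, expected proportion 0.5)
print(2 * binomCdf(12, 46, 0.5))   # ≈ 0.0016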
Note that if you have a different expectation about the population than 0.50, we cannot simply double the result anymore. We could, for a quick approximation, provided the sample size is large, but usually a more complex technique is then used, known as the ‘method of small p-values’, and in some cases another method known as ‘equal distance’. How these methods work is discussed in the appendix at the bottom of this page.
The binomial test can be computationally heavy, so sometimes an approximation is used. The approximation is then done using either the normal distribution or a goodness-of-fit test. In both cases so-called continuity corrections can be applied, and there are different variations on these corrections.
In short, the binomial test has the following steps (a small code sketch follows the list):
- The assumption about the population (the null hypothesis (H0)) is that the proportion of one of the two categories will be some amount (e.g. 0.5).
- The alternative is that it isn't (Ha) (e.g. the proportion in the population is not 0.5). This would be the so-called two-tailed test.
- Perform the binomial test and find the p-value (sig.).
- If the p-value is less than .05, the chance of a result as in the sample, or even rarer, if the assumption is true, is considered so low that the assumption is probably NOT true. The proportion in the population is then probably NOT the one assumed in step 1. This is then called a significant result.
- If the p-value is .05 or more, the chance of a result as in the sample, or even rarer, if the assumption is true, is not considered low enough to rule out that the assumption could be true. We don't have enough evidence to reject the assumption. This is then called a non-significant result.
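A minimal sketch of these steps in Python, assuming a recent SciPy version in which scipy.stats.binomtest is available (older versions use scipy.stats.binom_test instead):
from scipy.stats import binomtest

k, n = 12, 46    # observed count in one category, and the sample size
p0 = 0.5         # step 1: assumed proportion in the population (H0)

# steps 2 and 3: two-tailed test and its p-value
res = binomtest(k, n, p=p0, alternative='two-sided')
print(res.pvalue)    # ≈ .002, as reported earlier

# steps 4 and 5: compare the p-value with .05
if res.pvalue < 0.05:
    print('significant: the population proportion is probably not', p0)
else:
    print('not significant: not enough evidence to reject the assumption')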
The test informs us that there will most likely also be a difference in the population; it does not, however, say anything about whether it is a big or small difference. For that we need a so-called effect size, which is the topic of the next section.
Appendix
What if the expected proportion is not 0.50? (click to see the answer)
If in the example we’d expected 30% to be Female, our assumption about the population would change from 0.5 to 0.3. There are two methods we can then use to interpret ‘or more extreme’: the method of equal distance, and the method of small p-values.
The method of equal distance
This method looks at the number of cases. In a sample of 46 people, we’d then expect 46 x 0.3 = 13.8 Female respondents. We only had 12, so a difference of 13.8 – 12 = 1.8. The ‘equal distance method’ now means looking for the chance of having 12 Female respondents or fewer, and 13.8 + 1.8 = 15.6 Female respondents or more. Each of these two probabilities can be found using a binomial distribution. The ’12 or fewer’ probability is 0.3448, and the ’15.6 or more’ probability is 0.4031 (note that the 15.6 is always rounded down). Adding these two together then gives the two-sided significance of 0.7479.
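A minimal sketch of this ‘equal distance’ calculation in Python, assuming SciPy is available:
from math import floor
from scipy.stats import binom

k, n, p0 = 12, 46, 0.3
expected = n * p0                   # 13.8 expected Female respondents
upper = expected + (expected - k)   # 13.8 + 1.8 = 15.6, rounded down to 15

pLow  = binom.cdf(k, n, p0)                      # P(X <= 12) ≈ 0.3448
pHigh = 1 - binom.cdf(floor(upper) - 1, n, p0)   # P(X >= 15) ≈ 0.4031
print(pLow + pHigh)                              # ≈ 0.7479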
The method of small p-values
This method looks at the probabilities themselves. The probability of having exactly 12 Female scores out of a group of 46, if the chance of Female is 0.3, is 0.1119 (this is again a binomial distribution). The method of small p-values now considers ‘or more extreme’ any number between 0 and 46 (the sample size) that has a probability less than or equal to the 0.1119. This means we need to go over each option, determine its probability and check whether it is lower than or equal to it. So, the probability of 0 Female, the probability of 1 Female, etc. In the example all counts of 12 or less, and all of 16 or more, each have a probability of 0.1119 or less. In total this results in a two-sided significance of 0.6300.
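A minimal sketch of this ‘small p-values’ calculation in Python, again assuming SciPy is available (this also appears to be the approach SciPy's binomtest uses for its two-sided p-value):
from scipy.stats import binom

k, n, p0 = 12, 46, 0.3
pObserved = binom.pmf(k, n, p0)    # probability of exactly 12 Female, ≈ 0.1119

# add up the probability of every possible count from 0 to n that is at most as likely
sig = sum(binom.pmf(i, n, p0) for i in range(n + 1)
          if binom.pmf(i, n, p0) <= pObserved)
print(sig)                          # ≈ 0.6300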
Approximation using a Proportion test
(if you prefer to watch a video on this rather than read, click here)
The binomial distribution, which the binomial test uses, can be approximated with a normal distribution if the sample size is large enough. The calculations for a normal distribution are less computationally heavy than those for the binomial test, and it could therefore be a useful alternative if your sample size is very large.
Using this normal approximation is sometimes referred to as a score test. Since the binomial distribution is a so-called discrete distribution, and the normal is continuous, a so-called continuity correction is often recommended. The most famous one is probably the Yates correction.
This score test uses the expected proportion in its calculation; alternatively, you could use the sample proportion. In that case you get a so-called Wald test (also with a Yates correction if you like).
But wait, there is more. A chi-square goodness-of-fit (GoF) test could also be used, and for this there are also two flavours: a Pearson chi-square GoF test and a G GoF test (also known as a Likelihood Ratio GoF test). For these we can also apply corrections: the one from Yates, but also an E.S. Pearson correction, or a Williams correction.
With all these variations it is hard to choose. I'll follow the guidelines from Statstest.com: they recommend using the exact binomial test if the sample size is below 1000, and otherwise a G-test with Yates correction. The G-test is less well known, so if you want a better-known test, the Pearson version (with Yates) should be fine as well.
Click here to see how to perform a proportion test. For the GoF tests, see the test for a single nominal variable.
with Excel
Excel file from video TS - Proportion Test.xlsm.
with Flowgorithm
A basic implementation for a one-sample proportion test is shown in the flowchart in figure 2.
Figure 2
Flowgorithm for one-sample proportion test
It takes as input the frequency of one of the categories (k), the sample size (n), whether the Wald test should be used, and whether the Yates correction should be applied. It makes use of the cumulative distribution function of the standard normal distribution.
Flowgorithm file: TS - Proportion test (one-sample).fprg.
with Python
Jupyter Notebook from videos TS - Proportion Test.ipynb.
Datafile used in video: StudentStatistics.csv
with R (Studio)
Jupyter Notebook from videos: TS - Proportion test (one-sample).ipynb.
R script: TS - Proportion test (one-sample).r.
Datafile used in video: StudentStatistics.sav
with SPSS
Datafile used in video: StudentStatistics.sav
Formulas
The normal approximation of the binomial is usually done with:
\(z=\frac{x-\mu}{\sigma}\)
where \(\mu = n\times p_0\) and \(\sigma=\sqrt{\mu\times\left(1-p_0\right)}\)
Here \(p_0\) is the expected proportion (the proportion according to the null hypothesis), and \(x\) the observed count in the sample.
This z-value then follows a standard normal distribution, from which p-values can easily be calculated.
For the Wald test we do exactly the same, but change our \(\sigma\) to:
\(s = \sqrt{x\times\left(1 - \frac{x}{n}\right)}\)
For the Yates correction we take the absolute value of the numerator in the z-formula and subtract 0.5 from it, i.e.:
\(z_{Yates} = \frac{\left|x - \mu\right| - 0.5}{\sigma}\)
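A minimal sketch in Python of the score test, the Wald variation, and the Yates correction, following the formulas above, using the earlier example values, and assuming SciPy for the standard normal distribution:
from math import sqrt
from scipy.stats import norm

x, n, p0 = 12, 46, 0.5       # observed count, sample size, expected proportion

mu = n * p0                   # expected count under the null hypothesis
sigma = sqrt(mu * (1 - p0))   # standard deviation for the score test
s = sqrt(x * (1 - x / n))     # standard deviation for the Wald test (uses the sample proportion)

zScore = (x - mu) / sigma
zWald  = (x - mu) / s
zYates = (abs(x - mu) - 0.5) / sigma   # Yates continuity correction

# two-sided p-values from the standard normal distribution
for z in (zScore, zWald, zYates):
    print(2 * norm.sf(abs(z)))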
For the chi-square tests, please refer to the formulas in the ‘analysing a single variable’ section, under the test section.