Module stikpetP.other.p_adjustments
import copy
def p_adjust(p_values, method="bonferroni", alpha=0.05):
'''
P-Value Adjustments for Multiple Testing
----------------------------------------
Various methods exist to counter the problem of multiple testing. Bonferroni, Šidák, Hommel, Holm, Holm-Šidák, and Hochberg all attempt to control the family-wise error rate (FWER), while Benjamini-Hochberg and Benjamini-Yekutieli attempt to control the false discovery rate (FDR).
FWER methods aim to minimize the chance of making at least one Type I error (incorrectly rejecting the null hypothesis), while FDR methods attempt to balance false positives and false negatives.
Parameters
----------
p_values : list
the various p-values
method : {'bonferroni', 'sidak', 'hommel', 'holm', 'holm-sidak', 'hochberg', 'bh', 'by', 'hommel-original', None}, optional
method to use for the adjustment. Default is 'bonferroni'.
alpha : float, optional
the alpha level to use, only applies to 'hommel-original'. Default is 0.05.
Returns
-------
p_adj_val : list
the adjusted p-values
Notes
-----
**Bonferroni**
The formula used for the Bonferroni adjustment:
$$\\tilde{p}_i = \\min\\left(1, p_i \\times k\\right)$$
Where $p_i$ is the p-value of test $i$, and $k$ the number of tests.
Dunn (1961, p. 53) uses the Bonferroni inequality to adjust confidence intervals, which is why this is also called the Dunn-Bonferroni adjustment.
Bonferroni describes these inequalities in two papers (1935, 1936), but unfortunately I do not read Italian. The term 'Bonferroni inequalities' can already be found in Feller (1950, p. 75).
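The formula can be sketched as a small standalone helper (a hypothetical `bonferroni_adjust`, separate from `p_adjust` itself):

```python
def bonferroni_adjust(p_values):
    #Bonferroni: multiply each p-value by the number of tests, cap at 1
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]

#example with three tests: [0.01, 0.2, 0.4] becomes approx. [0.03, 0.6, 1.0]
```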
**Šidák**
The formula used (Šidák, 1967, p. 629):
$$\\tilde{p}_i = \\min\\left(1, 1 - \\left(1 - p_i\\right)^k\\right)$$
Where $p_i$ is the p-value of test $i$, and $k$ the number of tests.
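As a standalone sketch of this formula (a hypothetical `sidak_adjust` helper, separate from `p_adjust`):

```python
def sidak_adjust(p_values):
    #Sidak: 1 - (1 - p)^k, capped at 1
    k = len(p_values)
    return [min(1.0, 1 - (1 - p) ** k) for p in p_values]
```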
**Hommel**
The algorithm used (Wright, 1992, p. 1013):
1. Set \\(a_i = p_i\\) for all \\(i\\)
2. For each \\(m = k, (k - 1), ..., 2\\) (i.e. in descending order) do the following:
2.1. For \\(i > (k - m)\\)
* 2.1.1. calculate \\(c_i = \\frac{m\\times p_i}{m + i - k}\\)
* 2.1.2. set \\(c_{min} = \\min_i c_i\\) over these \\(i\\)
* 2.1.3. if \\(a_i < c_{min}\\) then \\(a_i = c_{min}\\)
2.2. For \\(i \\leq (k - m)\\)
* 2.2.1. let \\(c_i = \\min\\left(c_{min}, m \\times p_i\\right)\\)
* 2.2.2. if \\(a_i < c_i\\) then \\(a_i = c_i\\)
3. \\(p_i^{adj} = a_i\\)
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in ascending order, and \\(k\\) the number of tests.
Hommel's (1988) original procedure is, I think, slightly different; it is implemented as 'hommel-original'. The method from Wright appears to be the one used by the multipletests() function in the Python library statsmodels.stats.multitest, and by the p.adjust() function in R's stats library. The advantage of Wright's algorithm is that it does not require the alpha level to be known in order to adjust the p-values.
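The steps above can be sketched as follows (a hypothetical `wright_hommel_adjust` helper, written with the same 1-based indices as the description, and restoring the original input order at the end):

```python
def wright_hommel_adjust(p_values):
    #Wright's (1992) algorithm for Hommel-adjusted p-values
    k = len(p_values)
    order = sorted(range(k), key=lambda idx: p_values[idx])  #ascending p-values
    p = [p_values[idx] for idx in order]
    a = p[:]                          #step 1: a_i = p_i
    for m in range(k, 1, -1):         #step 2: m = k, k-1, ..., 2
        #step 2.1: indices i > k - m (1-based)
        c_min = min(m * p[i - 1] / (m + i - k) for i in range(k - m + 1, k + 1))
        for i in range(k - m + 1, k + 1):
            a[i - 1] = max(a[i - 1], c_min)
        #step 2.2: indices i <= k - m
        for i in range(1, k - m + 1):
            a[i - 1] = max(a[i - 1], min(c_min, m * p[i - 1]))
    adj = [0.0] * k
    for rank, idx in enumerate(order):
        adj[idx] = a[rank]            #restore the original order
    return adj
```

For example, `wright_hommel_adjust([0.01, 0.02, 0.05, 0.20])` yields approximately [0.04, 0.06, 0.10, 0.20].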
**Holm**
The formula used (SAS, n.d.):
$$\\tilde{p}_i =\\begin{cases} k\\times p_1 & i= 1\\\\ \\max \\left(\\tilde{p}_{i-1}, p_i \\times \\left(k+1-i\\right)\\right) & i=2,\\dots,k \\end{cases} $$
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in ascending order, and \\(k\\) the number of tests.
Holm (1979, p. 67) describes this procedure, but uses the alpha level rather than adjusted p-values.
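A standalone sketch of this step-down formula (a hypothetical `holm_adjust` helper, which restores the original input order at the end):

```python
def holm_adjust(p_values):
    #Holm step-down adjustment
    k = len(p_values)
    order = sorted(range(k), key=lambda idx: p_values[idx])  #ascending p-values
    p = [p_values[idx] for idx in order]
    adj_sorted, prev = [], 0.0
    for i, pi in enumerate(p, start=1):
        #running maximum enforces monotonicity of the adjusted values
        prev = max(prev, min(1.0, (k + 1 - i) * pi))
        adj_sorted.append(prev)
    adj = [0.0] * k
    for rank, idx in enumerate(order):
        adj[idx] = adj_sorted[rank]  #restore the original order
    return adj
```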
**Holm-Šidák**
The formula used (SAS, n.d.):
$$\\tilde{p}_i =\\begin{cases} 1 - \\left(1 - p_1 \\right)^k & i= 1\\\\ \\max \\left(\\tilde{p}_{i-1}, 1 - \\left(1 - p_i\\right)^{k - i + 1}\\right) & i=2,\\dots,k \\end{cases} $$
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in ascending order, and \\(k\\) the number of tests.
This uses Holm (1979, p. 67) step-down approach, but instead of using the Bonferroni adjustment, it uses Šidák.
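This can be sketched the same way as Holm, swapping in the Šidák adjustment at each step (a hypothetical `holm_sidak_adjust` helper):

```python
def holm_sidak_adjust(p_values):
    #Holm step-down with the Sidak adjustment at each step
    k = len(p_values)
    order = sorted(range(k), key=lambda idx: p_values[idx])  #ascending p-values
    p = [p_values[idx] for idx in order]
    adj_sorted, prev = [], 0.0
    for i, pi in enumerate(p, start=1):
        prev = max(prev, 1 - (1 - pi) ** (k - i + 1))  #Sidak with k-i+1 remaining tests
        adj_sorted.append(min(1.0, prev))
    adj = [0.0] * k
    for rank, idx in enumerate(order):
        adj[idx] = adj_sorted[rank]  #restore the original order
    return adj
```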
**Hochberg**
The formula used (SAS, n.d.):
$$\\tilde{p}_i =\\begin{cases} p_i & i= 1\\\\ \\min \\left(\\tilde{p}_{i-1}, i \\times p_i\\right) & i=2,\\dots,k \\end{cases} $$
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in DESCENDING order, and \\(k\\) the number of tests.
The procedure is described by Hochberg (1988, p. 801) using alpha levels for the criteria.
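A standalone sketch of this step-up formula (a hypothetical `hochberg_adjust` helper; note the descending sort):

```python
def hochberg_adjust(p_values):
    #Hochberg step-up adjustment
    k = len(p_values)
    order = sorted(range(k), key=lambda idx: p_values[idx], reverse=True)  #DESCENDING
    p = [p_values[idx] for idx in order]
    adj_sorted, prev = [], 1.0
    for i, pi in enumerate(p, start=1):
        prev = min(prev, 1.0, i * pi)  #running minimum enforces monotonicity
        adj_sorted.append(prev)
    adj = [0.0] * k
    for rank, idx in enumerate(order):
        adj[idx] = adj_sorted[rank]  #restore the original order
    return adj
```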
**Benjamini-Hochberg**
The algorithm used (Benjamini & Hochberg, 1995, p. 293):
1. Sort the p-values in ascending order
2. Find \\(j = \\max\\left(i | p_i \\times \\frac{k}{i} \\leq \\alpha\\right)\\)
3. All tests \\(i \\leq j\\) are considered significant.
To find the adjusted p-values (in reverse order):
$$\\tilde{p}_i =\\begin{cases} p_k & i= k\\\\ \\min \\left(\\tilde{p}_{i+1}, p_i \\times \\frac{k}{i}\\right) & i=(k-1),\\dots,1 \\end{cases} $$
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in ascending order, and \\(k\\) the number of tests.
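The reverse-order recursion for the adjusted p-values can be sketched as (a hypothetical `bh_adjust` helper):

```python
def bh_adjust(p_values):
    #Benjamini-Hochberg adjusted p-values
    k = len(p_values)
    order = sorted(range(k), key=lambda idx: p_values[idx])  #ascending p-values
    p = [p_values[idx] for idx in order]
    adj_sorted = [0.0] * k
    prev = 1.0
    for i in range(k, 0, -1):  #walk from the largest p-value down
        prev = min(prev, 1.0, p[i - 1] * k / i)
        adj_sorted[i - 1] = prev
    adj = [0.0] * k
    for rank, idx in enumerate(order):
        adj[idx] = adj_sorted[rank]  #restore the original order
    return adj
```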
**Benjamini-Yekutieli**
The algorithm used (Benjamini & Yekutieli, 2001, p. 1169):
1. Sort the p-values in ascending order
2. Determine \\(C(k) = \\sum_{i=1}^k \\frac{1}{i}\\)
3. Find \\(j = \\max\\left(i | p_i \\times \\frac{k\\times C(k)}{i} \\leq \\alpha\\right)\\)
4. All tests \\(i \\leq j\\) are considered significant.
To find the adjusted p-values (in reverse order):
$$\\tilde{p}_i =\\begin{cases} p_k\\times C(k) & i= k\\\\ \\min \\left(\\tilde{p}_{i+1}, p_i \\times \\frac{k\\times C(k)}{i}\\right) & i=(k-1),\\dots,1 \\end{cases} $$
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in ascending order, and \\(k\\) the number of tests.
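This is the Benjamini-Hochberg recursion with the extra factor \\(C(k)\\); as a standalone sketch (a hypothetical `by_adjust` helper):

```python
def by_adjust(p_values):
    #Benjamini-Yekutieli adjusted p-values
    k = len(p_values)
    c_k = sum(1 / i for i in range(1, k + 1))  #C(k), the harmonic number
    order = sorted(range(k), key=lambda idx: p_values[idx])  #ascending p-values
    p = [p_values[idx] for idx in order]
    adj_sorted = [0.0] * k
    prev = 1.0
    for i in range(k, 0, -1):  #walk from the largest p-value down
        prev = min(prev, 1.0, p[i - 1] * c_k * k / i)
        adj_sorted[i - 1] = prev
    adj = [0.0] * k
    for rank, idx in enumerate(order):
        adj[idx] = adj_sorted[rank]  #restore the original order
    return adj
```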
**Hommel Original**
Hommel (1988, p. 384) describes the following algorithm:
1. Compute \\(j = \\max\\left(i \\in \\left\\{1,\\dots, k\\right\\} \\mid p_{k-i+j} > j\\times\\frac{\\alpha}{i} \\text{ for } j=1,\\dots, i\\right)\\)
2. If the maximum does not exist, reject all, otherwise reject all with \\(p_i \\leq \\frac{\\alpha}{j}\\)
Where \\(p_i\\) is the p-value of test \\(i\\) after sorting all p-values in ascending order, and \\(k\\) the number of tests.
The function will adjust the p-values using:
$$\\tilde{p}_i =\\begin{cases} \\min\\left(1, j\\times p_i\\right) & \\text{if } j \\text{ exists} \\\\ \\min\\left(p_i, \\alpha\\right) & \\text{if } j \\text{ does not exist} \\end{cases}$$
References
----------
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. *Journal of the Royal Statistical Society Series B: Statistical Methodology, 57*(1), 289–300. doi:10.1111/j.2517-6161.1995.tb02031.x
Benjamini, Y., & Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. *The Annals of Statistics, 29*(4), 1165–1188. doi:10.1214/aos/1013699998
Bonferroni, C. E. (1935). Il calcolo delle assicurazioni su gruppi di teste. In *Studi in Onore del Professore Salvatore Ortu Carboni* (pp. 13–60).
Bonferroni, C. E. (1936). *Teoria statistica delle classi e calcolo delle probabilità*. Pubblicazioni Del R Istituto Superiore Di Scienze Economiche e Commerciali Di Firenze, 8, 3–62.
Dunn, O. J. (1961). Multiple comparisons among means. *Journal of the American Statistical Association, 56*(293), 52–64. doi:10.1080/01621459.1961.10482090
Feller, W. (1950). *An introduction to probability theory and its applications: Vol. One*. John Wiley & Sons.
Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. *Biometrika, 75*(4), 800–802. doi:10.1093/biomet/75.4.800
Holm, S. (1979). A simple sequentially rejective multiple test procedure. *Scandinavian Journal of Statistics, 6*(2), 65–70.
Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. *Biometrika, 75*(2), 383–386. doi:10.1093/biomet/75.2.383
SAS. (n.d.). PROC MULTTEST: p-Value Adjustments. SAS/STAT(R) 9.22 User’s Guide. Retrieved February 1, 2025, from https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_multtest_sect014.htm
Šidák, Z. (1967). Rectangular confidence regions for the means of multivariate normal distributions. *Journal of the American Statistical Association, 62*(318), 626. doi:10.2307/2283989
Wright, S. P. (1992). Adjusted p-values for simultaneous inference. *Biometrics, 48*(4), 1005. doi:10.2307/2532694
Author
------
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
'''
k = len(p_values)
if method is None:
        p_adj_val = list(p_values)
    elif method=='bonferroni':
        #calculate the Bonferroni adjustments
        p_adj_val = [min(1, p*k) for p in p_values]
    elif method=='sidak':
        #calculate the Sidak adjustments
        p_adj_val = [min(1, 1-(1-p)**k) for p in p_values]
else:
#we need to keep the original order
p_data = [[i, p_values[i]] for i in range(0, k)]
#Sort the p-values
p_sorted = sorted(p_data, key=lambda x: x[1])
if method=='hommel':
ai = copy.deepcopy(p_sorted)
ci = [0]*k
for m in range(k, 1, -1):
c_min = 1
for i in range(k - m + 1, k+1):
ci[i-1] = m*p_sorted[i-1][1]/(m + i - k)
if ci[i-1] < c_min:
c_min = ci[i-1]
for i in range(k - m + 1, k+1):
if ai[i-1][1] < c_min:
ai[i-1][1] = c_min
if k - m + 1 > 0:
for i in range(1, k - m + 1):
ci[i-1] = min(c_min, m*p_sorted[i-1][1])
if ai[i-1][1] < ci[i-1]:
ai[i-1][1] = ci[i-1]
for i in range(0, k):
                ai[i] = [p_sorted[i][0], p_sorted[i][1], ai[i][1]]
elif method=='holm':
#add the Holm adjusted p-values
ai = [[p_sorted[0][0], p_sorted[0][1], min(1, k*p_sorted[0][1])]]
for i in range(1, k):
ai.append([p_sorted[i][0], p_sorted[i][1], max(ai[i-1][2], min(1, p_sorted[i][1]*(k + 1 - (i + 1))))])
elif method=='holm-sidak':
ai = [[p_sorted[0][0], p_sorted[0][1], min(1, 1 - (1 - p_sorted[0][1])**k)]]
for i in range(1, k):
ai.append([p_sorted[i][0], p_sorted[i][1], max(ai[i-1][2], min(1, 1 - (1 - p_sorted[i][1])**(k + 1 - (i+1))))])
elif method=='hochberg':
#sort the data DESCENDING
p_sorted_desc = sorted(p_data, key=lambda x: x[1], reverse=True)
#add the Hochberg adjusted p-values and find the critical index
ai = [[p_sorted_desc[0][0], p_sorted_desc[0][1], p_sorted_desc[0][1]]]
for i in range(1, k):
ai.append([p_sorted_desc[i][0], p_sorted_desc[i][1], min(1, ai[i-1][2], p_sorted_desc[i][1]*(i+1))])
elif method=='bh':
#add the Benjamini-Hochberg adjusted p-values and find the critical index
ai = [0]*k
ai[k-1] = [p_sorted[k-1][0], p_sorted[k-1][1], p_sorted[k-1][1]]
for i in range(k-1, 0, -1):
ai[i-1] = [p_sorted[i-1][0], p_sorted[i-1][1], min(1, ai[i][2], p_sorted[i-1][1]*k/i)]
elif method=='by':
#add the Benjamini-Yekutieli adjusted p-values and find the critical index
ai = [0]*k
ck = sum([1/i for i in range(1, k+1)])
ai[k-1] = [p_sorted[k-1][0], p_sorted[k-1][1], min(1, p_sorted[k-1][1]*ck)]
for i in range(k-1, 0, -1):
ai[i-1] = [p_sorted[i-1][0], p_sorted[i-1][1], min(1, ai[i][2], p_sorted[i-1][1]*ck*k/i)]
        elif method=='hommel-original':
            #find the critical Hommel index: the LARGEST i such that
            #p_(k-i+j) > j*alpha/i for all j = 1, ..., i
            i_hommel = -1
            i = k
            while i > 0 and i_hommel == -1:
                holds = True
                for j in range(1, i + 1):
                    if p_sorted[k - i + j - 1][1] <= j*alpha/i:
                        holds = False
                if holds:
                    i_hommel = i
                i = i - 1
            ai = []
            if i_hommel > 0:
                for i in range(0, k):
                    ai.append([p_sorted[i][0], p_sorted[i][1], min(1, p_sorted[i][1]*i_hommel)])
            else:
                #the maximum does not exist: reject all, so cap the adjusted p-values at alpha
                for i in range(0, k):
                    ai.append([p_sorted[i][0], p_sorted[i][1], min(p_sorted[i][1], alpha)])
#sort back based on the id
p_res_sorted = sorted(ai, key=lambda x: x[0])
#separate the adj. p-values and conclusions
p_adj_val = []
for i in range(0, k):
p_adj_val.append(p_res_sorted[i][2])
#final result
return p_adj_val