Module stikpetP.tests.test_c_square
import pandas as pd
from scipy.stats import chi2
def ts_c_square(bin_field, ord_field, categories=None, levels=None):
'''
C-square Test
-----------------------------------------
An improved version of the Brunner-Munzel test. It tests so-called stochastic dominance: if you pick a random score from the first category, is there a higher (or lower) than 50% chance that it will be higher than a random score from the other category? Note that this is not the same as a median test. A very short example:
If we have the scores 40, 50 and 60 in group A, and the scores 30, 50 and 51 in group B, the median of each group is 50. However, if we pick 40 from group A, there is a 1/3 chance it is higher than a value from group B. If we pick 50 there is also a 1/3 chance, and if we pick 60 there is a 3/3 chance. Together, there is a (1/3 + 1/3 + 3/3)/3 = 5/9 chance that a random value from group A is higher than a random value from group B, which is quite different from 50%.
The test can therefore be used if we have a binary and an ordinal variable. The Brunner-Munzel test is known to be unsuitable for small sample sizes; this C-square test has no such limitation.
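The 5/9 from the example above can be checked directly by counting pairs. A minimal sketch in plain Python, using the toy scores from the example:

```python
from itertools import product

group_a = [40, 50, 60]
group_b = [30, 50, 51]

# count the pairs (a, b) in which the score from group A is strictly higher
wins = sum(a > b for a, b in product(group_a, group_b))
prob = wins / (len(group_a) * len(group_b))
print(prob)  # 5/9 ≈ 0.5556
```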
This function is shown in this [YouTube video](https://youtu.be/ndGxBskP9NI) and the test is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/Tests/C-squareTest.html).
Parameters
----------
bin_field : pandas series
data with categories for the rows
ord_field : pandas series
data with the scores (ordinal field)
categories : list or dictionary, optional
the two categories to use from bin_field. If not set the first two found will be used
levels : list or dictionary, optional
the scores in order
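How `levels` recodes the ordinal field can be sketched with `pandas.Series.map` (toy labels, for illustration only):

```python
import pandas as pd

scores = pd.Series(['low', 'high', 'mid', 'high'])
levels = {'low': 1, 'mid': 2, 'high': 3}

# labels are replaced by their rank-order codes; unmapped labels become NaN
coded = scores.map(levels)
print(coded.tolist())  # [1, 3, 2, 3]
```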
Returns
-------
A dataframe with:
* *var. est.*, the estimated overall variance
* *test statistic*, the C-square test statistic
* *df*, the degrees of freedom
* *p-value*, the significance (p-value), two-tailed
* *categories*, the two categories that were used
Notes
-----
The test statistic (Schüürhuis et al., 2025, p. 9, eq. 17):
$$C^2 = \\frac{4}{\\hat{\\sigma}_N^2} \\times \\hat{\\theta}_N\\times\\left(1 - \\hat{\\theta}_N\\right) \\times \\left(\\hat{\\theta}_N - \\frac{1}{2}\\right)^2$$
the estimate of the overall variance (Schüürhuis et al., 2025, p. 5, eq. 5):
$$\\hat{\\sigma}_N^2 = \\frac{1}{d_n}\\times\\left(\\left(\\sum_{k=1}^2 SS_k^*\\right) - n_1 \\times n_2 \\times \\left(\\hat{\\theta}_N\\times\\left(1 - \\hat{\\theta}_N\\right) - \\frac{\\hat{\\tau}_N}{4}\\right)\\right)$$
the sum of squares for each of the two categories of the placement values:
$$SS_k^* = \\sum_{i=1}^{n_k} \\left(R_{ik}^* - \\bar{R}_{k}^*\\right)^2$$
the mean of placement values (Schüürhuis et al., 2025, p. 5):
$$\\bar{R}_{k}^* = \\frac{\\sum_{i=1}^{n_k} R_{ik}^*}{n_k}$$
the placement values:
$$R_{ik}^* = R_{ik} - R_{ik}^{(k)}$$
Denominator (Schüürhuis et al., 2025, p. 5):
$$d_n = n_1\\times\\left(n_1 - 1\\right) \\times n_2\\times\\left(n_2 - 1\\right)$$
the probability of ties in the overlap (Schüürhuis et al., 2025, p. 5, eq. 4):
$$\\hat{\\tau}_N = \\frac{1}{n_1}\\times\\left(\\bar{R}_2^+ - \\bar{R}_2^- - \\left(\\bar{R}_2^{(2)+} - \\bar{R}_2^{(2)-}\\right)\\right)$$
estimated Mann-Whitney effect (Schüürhuis et al., 2025, p. 4, eq. 1):
$$\\hat{\\theta}_N = \\frac{1}{n_1} \\times \\left(\\bar{R}_2 - \\frac{n_2 + 1}{2}\\right)$$
note that this is the same as \\(\\hat{p}\\) in the Brunner-Munzel test.
some means from the second category:
$$\\bar{R}_2 = \\frac{\\sum_{i=1}^{n_2} R_{i2}}{n_2}, \\bar{R}_2^+ = \\frac{\\sum_{i=1}^{n_2} R_{i2}^+}{n_2}, \\bar{R}_2^{(2)+} = \\frac{\\sum_{i=1}^{n_2} R_{i2}^{(2)+}}{n_2}, \\bar{R}_2^{(2)-} = \\frac{\\sum_{i=1}^{n_2} R_{i2}^{(2)-}}{n_2}$$
*Symbols used:*
* \\(R_{ik}\\), the mid rank of the i-th score in category k, when using all combined scores
* \\(R_{ik}^{(k)}\\), the mid-rank of the i-th score in category k, when using only scores from category k
* \\(R_{ik}^-\\), the minimum rank of the i-th score in category k, when using all combined scores
* \\(R_{ik}^{(k)-}\\), the minimum-rank of the i-th score in category k, when using only scores from category k
* \\(R_{ik}^+\\), the maximum rank of the i-th score in category k, when using all combined scores
* \\(R_{ik}^{(k)+}\\), the maximum-rank of the i-th score in category k, when using only scores from category k
* \\(N\\), the total sample size
* \\(n_{k}\\), the number of scores in category k
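The formulas above can be traced step by step on the toy data from the introduction (group A: 40, 50, 60; group B: 30, 50, 51). A minimal pandas sketch following the equations in order; this is an illustration, not the function itself, and the variable names are chosen here for readability:

```python
import pandas as pd
from scipy.stats import chi2

df = pd.DataFrame({'cat': ['A'] * 3 + ['B'] * 3,
                   'score': [40, 50, 60, 30, 50, 51]})
n1 = n2 = 3

# mid-ranks over all scores, and mid-ranks within each category
df['mid'] = df['score'].rank()
df['mid_within'] = df.groupby('cat')['score'].rank()

# placement values R*_ik = R_ik - R_ik^(k), and their sums of squares SS_k*
df['placement'] = df['mid'] - df['mid_within']
ss = df.groupby('cat')['placement'].apply(lambda r: ((r - r.mean()) ** 2).sum())

# Mann-Whitney effect: theta = (mean mid-rank of group 2 - (n2 + 1)/2) / n1
b = df[df['cat'] == 'B']
theta = (b['mid'].mean() - (n2 + 1) / 2) / n1          # 7/18 ≈ 0.389

# tie probability tau from min/max ranks of group 2, overall vs within
tau = ((df['score'].rank(method='max')[b.index].mean()
        - df['score'].rank(method='min')[b.index].mean())
       - (b['score'].rank(method='max').mean()
          - b['score'].rank(method='min').mean())) / n1  # 1/9 ≈ 0.111

# variance estimate, C-square statistic and p-value
dn = n1 * (n1 - 1) * n2 * (n2 - 1)
var_n = (ss.sum() - n1 * n2 * (theta * (1 - theta) - tau / 4)) / dn
c_sq = 4 / var_n * theta * (1 - theta) * (theta - 0.5) ** 2  # 14/81 ≈ 0.173
p = chi2.sf(c_sq, df=1)
```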
Before, After and Alternatives
------------------------------
Before running the test you might first want to get an impression using a cross table and/or some visualisation:
* [tab_cross](../other/table_cross.html#tab_cross)
* [vi_bar_stacked_multiple](../visualisations/vis_bar_stacked_multiple.html#vi_bar_stacked_multiple)
After the test you might want an effect size measure:
* [es_common_language_is](../effect_sizes/eff_size_common_language_is.html#es_common_language_is) for Common Language Effect Size
* [me_hodges_lehmann_is](../measures/meas_hodges_lehmann_is.html#me_hodges_lehmann_is) for Hodges-Lehmann estimate
* [r_rank_biserial_is](../correlations/cor_rank_biserial_is.html#r_rank_biserial_is) for (Glass) Rank Biserial (Cliff delta)
Alternatives for this test could be:
* [ts_mann_whitney](../tests/test_mann_whitney.html#ts_mann_whitney) for the Mann-Whitney U test
* [ts_fligner_policello](../tests/test_fligner_policello.html#ts_fligner_policello) for the Fligner-Policello test
* [ts_brunner_munzel](../tests/test_brunner_munzel.html#ts_brunner_munzel) for the Brunner-Munzel test
* [ts_brunner_munzel_perm](../tests/test_brunner_munzel.html#ts_brunner_munzel_perm) for the Brunner-Munzel Permutation test
References
----------
Schüürhuis, S., Konietschke, F., & Brunner, E. (2025). A new approach to the Nonparametric Behrens-Fisher problem with compatible confidence intervals (arXiv:2504.01796). arXiv. https://doi.org/10.48550/arXiv.2504.01796
Author
------
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
--------
>>> file1 = "https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv"
>>> df1 = pd.read_csv(file1, sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> coding = {'Far too little': 1, 'too little': 2, 'Enough': 3, 'Too much': 4, 'Far too much': 5}
>>> ts_c_square(df1['Gen_Gender'], df1['Mix_NrAct'], levels=coding)
var. est. test statistic df p-value categories
0 0.003434 14.595231 1 0.000133 Male, Female
'''
    # DATA PREPARATION
    # convert to pandas series if needed
    if isinstance(bin_field, list):
        bin_field = pd.Series(bin_field)
    if isinstance(ord_field, list):
        ord_field = pd.Series(ord_field)

    # combine as one dataframe and remove missing values
    df = pd.concat([bin_field, ord_field], axis=1)
    df = df.dropna()
    df.columns = ['cat', 'score']

    # replace the ordinal values if levels is provided
    if levels is not None:
        df['score'] = df['score'].map(levels).astype('Int8')
    else:
        df.iloc[:, 1] = pd.to_numeric(df.iloc[:, 1])

    # the two categories
    if categories is not None:
        cat1 = categories[0]
        cat2 = categories[1]
    else:
        cat1 = df.iloc[:, 0].value_counts().index[0]
        cat2 = df.iloc[:, 0].value_counts().index[1]

    # remove scores not in either of the two categories
    df = df[df['cat'].isin([cat1, cat2])]

    # mid-, min- and max-ranks over all scores, and mid-ranks within each category
    df['mid-rank'] = df['score'].rank()
    df['min-rank'] = df['score'].rank(method='min')
    df['max-rank'] = df['score'].rank(method='max')
    df['mid-rank_within'] = df.groupby('cat')['score'].rank()
    nk = df['cat'].value_counts()

    # min- and max-ranks within the second category only
    df_2 = df[df['cat'].isin([cat2])].copy()
    df_2['min-rank_within'] = df_2['score'].rank(method='min')
    df_2['max-rank_within'] = df_2['score'].rank(method='max')

    # rank means for the second category
    barR2 = df_2['mid-rank'].mean()
    barR2plus = df_2['max-rank'].mean()
    barR2min = df_2['min-rank'].mean()
    barR2plus2 = df_2['max-rank_within'].mean()
    barR2min2 = df_2['min-rank_within'].mean()

    # Mann-Whitney effect estimate, tie probability, and denominator
    thetaN = 1 / nk[cat1] * (barR2 - (nk[cat2] + 1) / 2)
    tauN = 1 / nk[cat1] * (barR2plus - barR2min - (barR2plus2 - barR2min2))
    dn = nk[cat1] * (nk[cat1] - 1) * nk[cat2] * (nk[cat2] - 1)

    # placement values and their sums of squares per category
    df['placement'] = df['mid-rank'] - df['mid-rank_within']
    SSrAst = df.groupby('cat')['placement'].var(ddof=0) * nk

    # variance estimate, C-square statistic and p-value
    varN = 1 / dn * (sum(SSrAst) - nk[cat1] * nk[cat2] * (thetaN * (1 - thetaN) - tauN / 4))
    Csq = 1 / varN * 4 * thetaN * (1 - thetaN) * (thetaN - 1 / 2) ** 2
    p_val = chi2.sf(Csq, 1)

    results = {'var. est.': varN, 'test statistic': Csq, 'df': 1,
               'p-value': p_val, 'categories': f'{cat1}, {cat2}'}
    results_pd = pd.DataFrame([results])
    return results_pd