Module `stikpetP.other.poho_pairwise_bin`


import pandas as pd
from ..tests.test_binomial_os import ts_binomial_os
from ..tests.test_wald_os import ts_wald_os
from ..tests.test_score_os import ts_score_os
from ..other.p_adjustments import p_adjust

def ph_pairwise_bin(data, test="binomial", expCount=None, mtc='bonferroni', **kwargs):
    '''
    Pairwise Binary Test for Post-Hoc Analysis
    --------------------------------------------
    
    This function performs a one-sample binary test on every possible pair of categories in the data. The test can be a binomial, Wald or score test.

    Both the unadjusted p-values and the p-values adjusted for multiple testing (Bonferroni by default, see *mtc*) are reported.

    This function is shown in this [YouTube video](https://youtu.be/0uY4VAbvGpQ) and the test is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/Tests/PostHocAfterGoF.html)
    
    Parameters
    ----------
    data : list or pandas series
        the observed categories, one entry per case
    test : {"binomial", "score", "wald"}, optional
        test to use for each pair
    expCount : pandas dataframe, optional 
        categories and expected counts
    mtc : string, optional
        any of the methods available in p_adjust() to correct for multiple tests
    **kwargs : optional
        additional arguments for the specific test that are passed along.
    
    Returns
    -------
    pandas.DataFrame
        A dataframe with the following columns:
    
        - *category 1* : the label of the first category
        - *category 2* : the label of the second category
        - *n1* : the sample size of the first category
        - *n2* : the sample size of the second category
        - *n pair* : the combined sample size of the two categories
        - *obs. prop. 1* : the proportion of the first category within the pair
        - *exp. prop. 1* : the expected proportion of the first category within the pair
        - *statistic* : the test statistic ("n.a." for the binomial test)
        - *p-value* : the unadjusted significance
        - *adj. p-value* : the adjusted significance
        - *test* : a description of the test used

    Notes
    -----
    None.

    Before, After and Alternatives
    ------------------------------
    Before this an omnibus test might be helpful:
    * [ts_pearson_gof](../tests/test_pearson_gof.html#ts_pearson_gof) for Pearson Chi-Square Goodness-of-Fit Test
    * [ts_freeman_tukey_gof](../tests/test_freeman_tukey_gof.html#ts_freeman_tukey_gof) for Freeman-Tukey Test of Goodness-of-Fit
    * [ts_freeman_tukey_read](../tests/test_freeman_tukey_read.html#ts_freeman_tukey_read) for Freeman-Tukey-Read Test of Goodness-of-Fit
    * [ts_g_gof](../tests/test_g_gof.html#ts_g_gof) for G (Likelihood Ratio) Goodness-of-Fit Test
    * [ts_mod_log_likelihood_gof](../tests/test_mod_log_likelihood_gof.html#ts_mod_log_likelihood_gof) for Mod-Log Likelihood Test of Goodness-of-Fit
    * [ts_multinomial_gof](../tests/test_multinomial_gof.html#ts_multinomial_gof) for Multinomial Goodness-of-Fit Test
    * [ts_neyman_gof](../tests/test_neyman_gof.html#ts_neyman_gof) for Neyman Test of Goodness-of-Fit
    * [ts_powerdivergence_gof](../tests/test_powerdivergence_gof.html#ts_powerdivergence_gof) for Power Divergence GoF Test
    
    After this you might want to add an effect size measure:
    * [es_post_hoc_gof](../effect_sizes/eff_size_post_hoc_gof.html#es_post_hoc_gof) for various effect sizes
    
    Alternative post-hoc tests:
    * [ph_pairwise_gof](../other/poho_pairwise_gof.html#ph_pairwise_gof) for Pairwise Goodness-of-Fit Tests
    * [ph_residual_gof_bin](../other/poho_residual_gof_bin.html#ph_residual_gof_bin) for Residuals Tests
    * [ph_residual_gof_gof](../other/poho_residual_gof_gof.html#ph_residual_gof_gof) for Residuals Using Goodness-of-Fit Tests

    The binary test that is performed on each pair:
    * [ts_binomial_os](../tests/test_binomial_os.html#ts_binomial_os) for One-Sample Binomial Test
    * [ts_score_os](../tests/test_score_os.html#ts_score_os) for One-Sample Score Test
    * [ts_wald_os](../tests/test_wald_os.html#ts_wald_os) for One-Sample Wald Test

    More info on the adjustment for multiple testing:
    * [p_adjust](../other/p_adjustments.html#p_adjust)
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076

    Examples
    --------
    Examples: get data
    >>> import pandas as pd
    >>> pd.set_option('display.width',1000)
    >>> pd.set_option('display.max_columns', 1000)    
    >>> gss_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/GSS2012a.csv', sep=',', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = gss_df['mar1']

    Example 1 using default settings:
    >>> ph_pairwise_bin(ex1)
          category 1     category 2     n1     n2  n pair  obs. prop. 1  exp. prop. 1 statistic        p-value   adj. p-value                                                                         test
    0        MARRIED  NEVER MARRIED  972.0  395.0  1367.0      0.711046           0.5      n.a.   1.052263e-56   1.052263e-55        one-sample binomial, with equal-distance method (with p0 for MARRIED)
    1        MARRIED       DIVORCED  972.0  314.0  1286.0      0.755832           0.5      n.a.   7.829174e-79   7.829174e-78        one-sample binomial, with equal-distance method (with p0 for MARRIED)
    2        MARRIED        WIDOWED  972.0  181.0  1153.0      0.843018           0.5      n.a.  1.407217e-131  1.407217e-130        one-sample binomial, with equal-distance method (with p0 for MARRIED)
    3        MARRIED      SEPARATED  972.0   79.0  1051.0      0.924833           0.5      n.a.  1.267980e-196  1.267980e-195        one-sample binomial, with equal-distance method (with p0 for MARRIED)
    4  NEVER MARRIED       DIVORCED  395.0  314.0   709.0      0.557123           0.5      n.a.   3.001933e-03   3.001933e-02  one-sample binomial, with equal-distance method (with p0 for NEVER MARRIED)
    5  NEVER MARRIED        WIDOWED  395.0  181.0   576.0      0.685764           0.5      n.a.   1.352112e-19   1.352112e-18  one-sample binomial, with equal-distance method (with p0 for NEVER MARRIED)
    6  NEVER MARRIED      SEPARATED  395.0   79.0   474.0      0.833333           0.5      n.a.   7.075688e-52   7.075688e-51  one-sample binomial, with equal-distance method (with p0 for NEVER MARRIED)
    7       DIVORCED        WIDOWED  314.0  181.0   495.0      0.634343           0.5      n.a.   3.295753e-09   3.295753e-08       one-sample binomial, with equal-distance method (with p0 for DIVORCED)
    8       DIVORCED      SEPARATED  314.0   79.0   393.0      0.798982           0.5      n.a.   1.472395e-34   1.472395e-33       one-sample binomial, with equal-distance method (with p0 for DIVORCED)
    9        WIDOWED      SEPARATED  181.0   79.0   260.0      0.696154           0.5      n.a.   2.223544e-10   2.223544e-09        one-sample binomial, with equal-distance method (with p0 for WIDOWED)

    Example 2 using a score test with Yates continuity correction and a Holm adjustment:
    >>> ph_pairwise_bin(ex1, test="score", mtc='holm', cc='yates')
          category 1     category 2     n1     n2  n pair  obs. prop. 1  exp. prop. 1  statistic       p-value  adj. p-value                                                                           test
    0        MARRIED  NEVER MARRIED  972.0  395.0  1367.0      0.711046           0.5 -15.578952  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
    1        MARRIED       DIVORCED  972.0  314.0  1286.0      0.755832           0.5 -18.320819  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
    2        MARRIED        WIDOWED  972.0  181.0  1153.0      0.843018           0.5 -23.265503  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
    3        MARRIED      SEPARATED  972.0   79.0  1051.0      0.924833           0.5 -27.514619  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
    4  NEVER MARRIED       DIVORCED  395.0  314.0   709.0      0.557123           0.5  -3.004463  2.660501e-03  2.660501e-03  one-sample score with Yates continuity correction (with p0 for NEVER MARRIED)
    5  NEVER MARRIED        WIDOWED  395.0  181.0   576.0      0.685764           0.5  -8.875000  0.000000e+00  0.000000e+00  one-sample score with Yates continuity correction (with p0 for NEVER MARRIED)
    6  NEVER MARRIED      SEPARATED  395.0   79.0   474.0      0.833333           0.5 -14.468429  0.000000e+00  0.000000e+00  one-sample score with Yates continuity correction (with p0 for NEVER MARRIED)
    7       DIVORCED        WIDOWED  314.0  181.0   495.0      0.634343           0.5  -5.932959  2.975236e-09  5.950471e-09       one-sample score with Yates continuity correction (with p0 for DIVORCED)
    8       DIVORCED      SEPARATED  314.0   79.0   393.0      0.798982           0.5 -11.803739  0.000000e+00  0.000000e+00       one-sample score with Yates continuity correction (with p0 for DIVORCED)
    9        WIDOWED      SEPARATED  181.0   79.0   260.0      0.696154           0.5  -6.263754  3.758180e-10  1.127454e-09        one-sample score with Yates continuity correction (with p0 for WIDOWED)
    
    '''
    # accept a plain list by converting it to a pandas Series
    if isinstance(data, list):
        data = pd.Series(data)
        
    freq = data.value_counts()
    
    if expCount is None:
        #assume all to be equal
        n = sum(freq)
        k = len(freq)
        categories = list(freq.index)
        expC = [n/k] * k
        
    else:
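        #total the expected counts (nE) and the observed counts (n) of the listed categories
        nE = 0
        n = 0
        for i in range(0, len(expCount)):
            nE = nE + expCount.iloc[i,1]
            n = n + freq[expCount.iloc[i,0]]
        
        #rescale the expected counts so they sum to the observed total
        expC = []
        for i in range(0,len(expCount)):
            expC.append(expCount.iloc[i, 1]/nE*n)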
            
        k = len(expC)
        categories = list(expCount.iloc[:,0])

    n_pairs = int(k*(k-1)/2)

    results = pd.DataFrame()
    resRow=0
    for i in range(0, k-1):
        for j in range(i+1, k):
            #category names
            results.at[resRow, 0] = categories[i]
            results.at[resRow, 1] = categories[j]
            #category sizes
            n1 = freq[categories[i]]
            n2 = freq[categories[j]]
            results.at[resRow, 2] = n1
            results.at[resRow, 3] = n2
            results.at[resRow, 4] = n1 + n2
    
            #observed and expected proportion
            obP1 = n1/(n1 + n2)
            exP1 = expC[i]/(expC[i]+expC[j])
            results.at[resRow, 5] = obP1
            results.at[resRow, 6] = exP1

            pair = [categories[i], categories[j]]
            
            if test=="binomial":
                # the test statistic
                results.at[resRow, 7] = "n.a."
                
                pair_test_result = ts_binomial_os(data, codes=pair, p0=exP1, **kwargs)
                # the p-value
                results.at[resRow, 8] = pair_test_result.iloc[0, 0]
    
                # the adj. p-value
                #fill something for the adjusted p-values
                results.at[resRow, 9] = results.at[resRow, 8]
                # description of test
                results.at[resRow, 10] = pair_test_result.iloc[0, 1]
    
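            else:
                if test=="wald":
                    pair_test_result = ts_wald_os(data, codes=pair, p0=exP1, **kwargs)
                elif test=="score":
                    pair_test_result = ts_score_os(data, codes=pair, p0=exP1, **kwargs)
                else:
                    # fail early instead of hitting an undefined pair_test_result below
                    raise ValueError('test must be "binomial", "wald" or "score"')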
    
                # the test statistic
                results.at[resRow, 7] = pair_test_result.iloc[0, 1]
                
                # the p-value
                results.at[resRow, 8] = pair_test_result.iloc[0, 2]
                #fill something for the adjusted p-values
                results.at[resRow, 9] = results.at[resRow, 8]
                # description of test
                results.at[resRow, 10] = pair_test_result.iloc[0, 3]
            resRow = resRow + 1

    results.iloc[:,9] = p_adjust(results.iloc[:,8], method=mtc)
    
    results.columns = ["category 1", "category 2", "n1", "n2", "n pair", "obs. prop. 1", "exp. prop. 1", "statistic", "p-value", "adj. p-value", "test"]
    return results

Functions

def ph_pairwise_bin(data, test='binomial', expCount=None, mtc='bonferroni', **kwargs)

Pairwise Binary Test for Post-Hoc Analysis

This function performs a one-sample binary test on every possible pair of categories in the data. The test can be a binomial, Wald or score test.
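The pairing step can be sketched without the package itself. The snippet below is a minimal illustration, not the package's implementation: it counts each category, enumerates every pair with `itertools.combinations`, and computes the observed and expected proportion of the first category in each pair, assuming equal expected counts. The helper name `pair_proportions` is hypothetical, and the sample is rebuilt from the `mar1` category counts shown in the examples on this page.

```python
from itertools import combinations

import pandas as pd


def pair_proportions(data):
    """List every category pair with the observed and expected share of the first category."""
    freq = pd.Series(data).value_counts()
    k = len(freq)
    # equal expected counts, so every pair gets an expected proportion of 0.5
    expected = {cat: freq.sum() / k for cat in freq.index}
    rows = []
    for cat1, cat2 in combinations(freq.index, 2):
        n1, n2 = int(freq[cat1]), int(freq[cat2])
        rows.append({
            "category 1": cat1,
            "category 2": cat2,
            "n pair": n1 + n2,
            "obs. prop. 1": n1 / (n1 + n2),
            "exp. prop. 1": expected[cat1] / (expected[cat1] + expected[cat2]),
        })
    return pd.DataFrame(rows)


# rebuild a sample with the category counts from the examples
counts = {"MARRIED": 972, "NEVER MARRIED": 395, "DIVORCED": 314, "WIDOWED": 181, "SEPARATED": 79}
sample = [label for label, n in counts.items() for _ in range(n)]
pairs = pair_proportions(sample)
print(pairs)
```

Five categories give ten pairs. Each pair is then handed to the chosen one-sample test with the expected proportion as the null value.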

Both the unadjusted p-values and the p-values adjusted for multiple testing (Bonferroni by default, see mtc) are reported.

This function is shown in this [YouTube video](https://youtu.be/0uY4VAbvGpQ) and the test is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/Tests/PostHocAfterGoF.html).

Parameters

data : list or pandas series: the observed categories, one entry per case
test : {"binomial", "score", "wald"}, optional: test to use for each pair
expCount : pandas dataframe, optional: categories and expected counts
mtc : string, optional: any of the methods available in p_adjust() to correct for multiple tests
**kwargs : optional: additional arguments for the specific test that are passed along.
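As a sketch of what `expCount` could look like: the source reads the first column as the category label and the second as the expected count, by position, so the column names used below are only placeholders. The expected proportion for a pair is the first category's expected count divided by the sum of both, which the hypothetical helper `expected_prop` reproduces by hand. The expected counts are made-up illustration values.

```python
import pandas as pd

# first column: category label, second column: expected count (names are placeholders)
exp_count = pd.DataFrame({
    "category": ["MARRIED", "NEVER MARRIED", "DIVORCED", "WIDOWED", "SEPARATED"],
    "expected": [800, 500, 300, 200, 100],
})


def expected_prop(exp_count, cat1, cat2):
    """Expected proportion of cat1 within the pair (cat1, cat2)."""
    lookup = dict(zip(exp_count.iloc[:, 0], exp_count.iloc[:, 1]))
    return lookup[cat1] / (lookup[cat1] + lookup[cat2])


print(expected_prop(exp_count, "MARRIED", "NEVER MARRIED"))
```

The function rescales the expected counts to the observed total first, but that common factor cancels in the ratio. Such a dataframe would presumably be passed as `ph_pairwise_bin(ex1, expCount=exp_count)`.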

Returns

pandas.DataFrame

A dataframe with the following columns:

category 1 : the label of the first category
category 2 : the label of the second category
n1 : the sample size of the first category
n2 : the sample size of the second category
n pair : the combined sample size of the two categories
obs. prop. 1 : the proportion of the first category within the pair
exp. prop. 1 : the expected proportion of the first category within the pair
statistic : the test statistic ("n.a." for the binomial test)
p-value : the unadjusted significance
adj. p-value : the adjusted significance
test : a description of the test used
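Because the result is an ordinary dataframe, the usual pandas filtering applies. The sketch below rebuilds three rows of the Example 1 output by hand (only a few of the columns, with values copied from that example) and keeps the pairs whose adjusted p-value falls below a chosen threshold. The threshold of 0.01 is an arbitrary illustration value.

```python
import pandas as pd

# three rows copied from the Example 1 output, limited to four columns
results = pd.DataFrame({
    "category 1": ["MARRIED", "NEVER MARRIED", "DIVORCED"],
    "category 2": ["NEVER MARRIED", "DIVORCED", "WIDOWED"],
    "p-value": [1.052263e-56, 3.001933e-03, 3.295753e-09],
    "adj. p-value": [1.052263e-55, 3.001933e-02, 3.295753e-08],
})

alpha = 0.01
# keep only the pairs that remain significant after the adjustment
significant = results[results["adj. p-value"] < alpha]
print(significant[["category 1", "category 2", "adj. p-value"]])
```

Here the NEVER MARRIED versus DIVORCED pair drops out: its unadjusted p-value is below 0.01, but its adjusted value is not.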

Notes

None.

Before, After and Alternatives

Before this an omnibus test might be helpful:

* ts_pearson_gof for Pearson Chi-Square Goodness-of-Fit Test
* ts_freeman_tukey_gof for Freeman-Tukey Test of Goodness-of-Fit
* ts_freeman_tukey_read for Freeman-Tukey-Read Test of Goodness-of-Fit
* ts_g_gof for G (Likelihood Ratio) Goodness-of-Fit Test
* ts_mod_log_likelihood_gof for Mod-Log Likelihood Test of Goodness-of-Fit
* ts_multinomial_gof for Multinomial Goodness-of-Fit Test
* ts_neyman_gof for Neyman Test of Goodness-of-Fit
* ts_powerdivergence_gof for Power Divergence GoF Test

After this you might want to add an effect size measure:

* es_post_hoc_gof for various effect sizes

Alternative post-hoc tests:

* ph_pairwise_gof for Pairwise Goodness-of-Fit Tests
* ph_residual_gof_bin for Residuals Tests
* ph_residual_gof_gof for Residuals Using Goodness-of-Fit Tests

The binary test that is performed on each pair:

* ts_binomial_os for One-Sample Binomial Test
* ts_score_os for One-Sample Score Test
* ts_wald_os for One-Sample Wald Test

More info on the adjustment for multiple testing:

* p_adjust

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076

Examples

Examples: get data

>>> import pandas as pd
>>> pd.set_option('display.width',1000)
>>> pd.set_option('display.max_columns', 1000)    
>>> gss_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/GSS2012a.csv', sep=',', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = gss_df['mar1']

Example 1 using default settings:

>>> ph_pairwise_bin(ex1)
      category 1     category 2     n1     n2  n pair  obs. prop. 1  exp. prop. 1 statistic        p-value   adj. p-value                                                                         test
0        MARRIED  NEVER MARRIED  972.0  395.0  1367.0      0.711046           0.5      n.a.   1.052263e-56   1.052263e-55        one-sample binomial, with equal-distance method (with p0 for MARRIED)
1        MARRIED       DIVORCED  972.0  314.0  1286.0      0.755832           0.5      n.a.   7.829174e-79   7.829174e-78        one-sample binomial, with equal-distance method (with p0 for MARRIED)
2        MARRIED        WIDOWED  972.0  181.0  1153.0      0.843018           0.5      n.a.  1.407217e-131  1.407217e-130        one-sample binomial, with equal-distance method (with p0 for MARRIED)
3        MARRIED      SEPARATED  972.0   79.0  1051.0      0.924833           0.5      n.a.  1.267980e-196  1.267980e-195        one-sample binomial, with equal-distance method (with p0 for MARRIED)
4  NEVER MARRIED       DIVORCED  395.0  314.0   709.0      0.557123           0.5      n.a.   3.001933e-03   3.001933e-02  one-sample binomial, with equal-distance method (with p0 for NEVER MARRIED)
5  NEVER MARRIED        WIDOWED  395.0  181.0   576.0      0.685764           0.5      n.a.   1.352112e-19   1.352112e-18  one-sample binomial, with equal-distance method (with p0 for NEVER MARRIED)
6  NEVER MARRIED      SEPARATED  395.0   79.0   474.0      0.833333           0.5      n.a.   7.075688e-52   7.075688e-51  one-sample binomial, with equal-distance method (with p0 for NEVER MARRIED)
7       DIVORCED        WIDOWED  314.0  181.0   495.0      0.634343           0.5      n.a.   3.295753e-09   3.295753e-08       one-sample binomial, with equal-distance method (with p0 for DIVORCED)
8       DIVORCED      SEPARATED  314.0   79.0   393.0      0.798982           0.5      n.a.   1.472395e-34   1.472395e-33       one-sample binomial, with equal-distance method (with p0 for DIVORCED)
9        WIDOWED      SEPARATED  181.0   79.0   260.0      0.696154           0.5      n.a.   2.223544e-10   2.223544e-09        one-sample binomial, with equal-distance method (with p0 for WIDOWED)

Example 2 using a score test with Yates continuity correction and a Holm adjustment:

>>> ph_pairwise_bin(ex1, test="score", mtc='holm', cc='yates')
      category 1     category 2     n1     n2  n pair  obs. prop. 1  exp. prop. 1  statistic       p-value  adj. p-value                                                                           test
0        MARRIED  NEVER MARRIED  972.0  395.0  1367.0      0.711046           0.5 -15.578952  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
1        MARRIED       DIVORCED  972.0  314.0  1286.0      0.755832           0.5 -18.320819  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
2        MARRIED        WIDOWED  972.0  181.0  1153.0      0.843018           0.5 -23.265503  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
3        MARRIED      SEPARATED  972.0   79.0  1051.0      0.924833           0.5 -27.514619  0.000000e+00  0.000000e+00        one-sample score with Yates continuity correction (with p0 for MARRIED)
4  NEVER MARRIED       DIVORCED  395.0  314.0   709.0      0.557123           0.5  -3.004463  2.660501e-03  2.660501e-03  one-sample score with Yates continuity correction (with p0 for NEVER MARRIED)
5  NEVER MARRIED        WIDOWED  395.0  181.0   576.0      0.685764           0.5  -8.875000  0.000000e+00  0.000000e+00  one-sample score with Yates continuity correction (with p0 for NEVER MARRIED)
6  NEVER MARRIED      SEPARATED  395.0   79.0   474.0      0.833333           0.5 -14.468429  0.000000e+00  0.000000e+00  one-sample score with Yates continuity correction (with p0 for NEVER MARRIED)
7       DIVORCED        WIDOWED  314.0  181.0   495.0      0.634343           0.5  -5.932959  2.975236e-09  5.950471e-09       one-sample score with Yates continuity correction (with p0 for DIVORCED)
8       DIVORCED      SEPARATED  314.0   79.0   393.0      0.798982           0.5 -11.803739  0.000000e+00  0.000000e+00       one-sample score with Yates continuity correction (with p0 for DIVORCED)
9        WIDOWED      SEPARATED  181.0   79.0   260.0      0.696154           0.5  -6.263754  3.758180e-10  1.127454e-09        one-sample score with Yates continuity correction (with p0 for WIDOWED)
