Module `stikpetP.tests.test_wilcoxon_ps`

import pandas as pd
from ..tests.test_wilcoxon_os import ts_wilcoxon_os

def ts_wilcoxon_ps(field1, field2, levels=None, dmu=0, 
                   appr = "wilcoxon", 
                   noDiff = "wilcoxon", 
                   ties = True, 
                   cc = False):
    '''
    Wilcoxon Signed Rank Test (Paired Samples)
    ------------------------------------------
    The paired-samples Wilcoxon signed rank test is often considered the non-parametric version of a paired-samples t-test. It can be used to determine whether the median differs significantly between the two variables. Strictly speaking, it does not always test this specifically, but rather whether the mean rank differs significantly.
    
    The p-value is the probability of a result as in the sample, or more extreme, if the assumption about the population were true. If it is below a certain threshold (usually 0.05), the assumption about the population is rejected.
    
    Results for this test can vary between software packages, since there are a few different approaches, especially when there are so-called ties.
    
    This function simply determines the differences between the two provided variables, and then passes these differences along to the one-sample version. See ts_wilcoxon_os() for details on this.
    
    Parameters
    ----------
    field1 : pandas series or list
        the ordinal or scale scores of the first variable
    field2 : pandas series or list
        the ordinal or scale scores of the second variable
    levels : list or dictionary, optional
        the categories to use
    dmu : float, optional 
        hypothesized difference. Default is zero
    appr : {"wilcoxon", "exact", "imanz", "imant"}, optional
        method to use for approximation. Default is "wilcoxon"
    noDiff : {"wilcoxon", "pratt", "zsplit"}, optional 
        method to deal with differences equal to dmu. Default is "wilcoxon"
    ties : boolean, optional 
        to use a tie correction. Default is True
    cc : boolean, optional 
        use a continuity correction. Default is False
        
    Returns
    -------
    res : dataframe with 
    
    * "nr", the number of ranks used in calculation
    * "mu", the median according to the null hypothesis
    * "W", the Wilcoxon W value
    * "statistic", the test statistic
    * "df", degrees of freedom (only applicable for Iman t approximation)
    * "p-value", significance (p-value)
    * "test", description of the test used
    
    Notes
    -----
    The formula used (Wilcoxon, 1945):
    $$d_i = x_i - y_i$$
    
    These differences are then passed on to ts_wilcoxon_os().
    
    *Symbols used*
    
    * \\(x_i\\) is the i-th score from the first variable
    * \\(y_i\\) is the i-th score from the second variable
    
    References
    ----------
    Wilcoxon, F. (1945). Individual comparisons by ranking methods. *Biometrics Bulletin, 1*(6), 80. doi:10.2307/3001968
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076   
    
    
    '''
    
    # Accept plain lists as well as pandas Series
    if isinstance(field1, list):
        field1 = pd.Series(field1)
        
    if isinstance(field2, list):
        field2 = pd.Series(field2)
    
    data = pd.concat([field1, field2], axis=1)
    data.columns = ["field1", "field2"]
    # Remove rows with missing values and reset the index
    data = data.dropna()
    data = data.reset_index(drop=True)
    
    if levels is not None:
        # Map category labels to their codes, then make sure they are numeric
        data["field1"] = pd.to_numeric(data["field1"].replace(levels))
        data["field2"] = pd.to_numeric(data["field2"].replace(levels))
    else:
        # pd.to_numeric only accepts 1-d input, so convert column by column
        data = data.apply(pd.to_numeric)
    
    # Paired differences, which are analysed with the one-sample test
    data["diff"] = data["field1"] - data["field2"]
    
    res = ts_wilcoxon_os(data["diff"], mu=dmu, ties=ties, appr=appr, eqMed=noDiff, cc=cc)
    # Relabel the test description by column name rather than by position
    res["test"] = res["test"].str.replace("one-sample", "paired samples", regex=False)
    
    
    return res

Functions

def ts_wilcoxon_ps(field1, field2, levels=None, dmu=0, appr='wilcoxon', noDiff='wilcoxon', ties=True, cc=False)

Wilcoxon Signed Rank Test (Paired Samples)

The paired-samples Wilcoxon signed rank test is often considered the non-parametric version of a paired-samples t-test. It can be used to determine whether the median differs significantly between the two variables. Strictly speaking, it does not always test this specifically, but rather whether the mean rank differs significantly.

The p-value is the probability of a result as in the sample, or more extreme, if the assumption about the population were true. If it is below a certain threshold (usually 0.05), the assumption about the population is rejected.

Results for this test can vary between software packages, since there are a few different approaches, especially when there are so-called ties.

This function simply determines the differences between the two provided variables, and then passes these differences along to the one-sample version. See ts_wilcoxon_os() for details on this.
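
The delegation described above can be sketched in plain Python. This is a minimal illustration, not the library's code: the helper names `paired_differences`, `average_ranks` and `signed_rank_sums` are hypothetical, differences equal to `dmu` are simply dropped (the original Wilcoxon handling), and the sample scores are made up.

```python
def paired_differences(first, second):
    """Return x_i - y_i for every pair in which both scores are present."""
    return [x - y for x, y in zip(first, second)
            if x is not None and y is not None]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def signed_rank_sums(first, second, dmu=0):
    """Sum the ranks of the positive and of the negative differences from dmu."""
    # Differences equal to dmu carry no sign and are dropped here.
    diffs = [d - dmu for d in paired_differences(first, second) if d != dmu]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), w_plus, w_minus


before = [5, 3, 4, None, 2, 4]   # made-up scores, None marks a missing value
after = [3, 3, 1, 2, 4, 3]
n, w_plus, w_minus = signed_rank_sums(before, after)
```

For these scores the complete pairs give the differences 2, 0, 3, -2 and 1. After dropping the zero, four ranks remain, and the rank sums are 7.5 for the positive and 2.5 for the negative differences. Which of these sums a particular package reports as W is one of the ways results can differ.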

Parameters

field1 : pandas series or list: the ordinal or scale scores of the first variable
field2 : pandas series or list: the ordinal or scale scores of the second variable
levels : list or dictionary, optional: the categories to use
dmu : float, optional: hypothesized difference. Default is zero
appr : {"wilcoxon", "exact", "imanz", "imant"}, optional: method to use for approximation. Default is "wilcoxon"
noDiff : {"wilcoxon", "pratt", "zsplit"}, optional: method to deal with differences equal to dmu. Default is "wilcoxon"
ties : boolean, optional: to use a tie correction. Default is True
cc : boolean, optional: use a continuity correction. Default is False
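
As an illustration of the `levels` parameter, the sketch below recodes text labels to numbers before taking the differences. It uses plain Python rather than pandas, and the coding, the labels and the helper name `recode` are assumptions made up for this example.

```python
# Hypothetical ordinal coding; labels and codes are made up for illustration.
levels = {"disagree": 1, "neutral": 2, "agree": 3}

first = ["agree", "neutral", None, "disagree"]
second = ["neutral", "neutral", "agree", "agree"]


def recode(labels, coding):
    """Replace each label by its numeric code, keeping missing values as None."""
    return [None if label is None else coding[label] for label in labels]


# Keep only complete pairs, mirroring the removal of rows with missing values.
pairs = [(x, y) for x, y in zip(recode(first, levels), recode(second, levels))
         if x is not None and y is not None]
differences = [x - y for x, y in pairs]
```

The third pair is discarded because its first score is missing, so the remaining differences are 1, 0 and -2.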

Returns

res : dataframe with

"nr", the number of ranks used in calculation
"mu", the median according to the null hypothesis
"W", the Wilcoxon W value
"statistic", the test statistic
"df", degrees of freedom (only applicable for Iman t approximation)
"p-value", significance (p-value)
"test", description of the test used

Notes

The formula used (Wilcoxon, 1945): $d_i = x_i - y_i$

These differences are then passed on to ts_wilcoxon_os().

Symbols used

$x_i$ is the i-th score from the first variable
$y_i$ is the i-th score from the second variable
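
To give a feel for what the `ties` and `cc` options adjust, here is a sketch of the textbook normal approximation for a signed rank sum. It is an assumption that this resembles what the one-sample function does; this page does not confirm the exact formulas behind each `appr` option, and the function name `normal_approximation` is hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def normal_approximation(w, abs_diffs, ties=True, cc=False):
    """Return |z| and a two-sided p-value for a signed rank sum w."""
    n = len(abs_diffs)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    if ties:
        # Tie correction: reduce the variance for each group of equal |d_i|.
        var_w -= sum(t**3 - t for t in Counter(abs_diffs).values()) / 48
    deviation = abs(w - mean_w)
    if cc:
        # Continuity correction: move the deviation half a unit towards zero.
        deviation = max(deviation - 0.5, 0.0)
    z = deviation / sqrt(var_w)
    p_value = 2 * (1 - NormalDist().cdf(z))
    return z, p_value


# Made-up input: rank sum 7.5 from the absolute differences 2, 3, 2 and 1.
z, p = normal_approximation(7.5, [2, 3, 2, 1])
```

Four differences are far too few for a normal approximation to be trustworthy; the small input only keeps the arithmetic easy to follow by hand.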

References

Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80. doi:10.2307/3001968

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
