Module stikpetP.tests.test_wilcoxon_ps
import pandas as pd
from ..tests.test_wilcoxon_os import ts_wilcoxon_os

def ts_wilcoxon_ps(field1, field2, levels=None, dmu=0,
                   appr="wilcoxon",
                   noDiff="wilcoxon",
                   ties=True,
                   cc=False):
    '''
    Wilcoxon Signed Rank Test (Paired Samples)
    ------------------------------------------
    The paired-sample Wilcoxon signed rank test is often considered the non-parametric version of a paired-samples t-test. It can be used to determine if the median is significantly different between the two variables. Strictly speaking it does not always test this, but rather whether the mean rank is significantly different.

    The p-value is the probability of a result as in the sample, or more extreme, if the assumption about the population were true. If it is below a certain threshold (usually 0.05), the assumption about the population is rejected.

    Results in software packages for this test can vary, since there are a few different approaches, especially when there are so-called ties.

    This function simply determines the differences between the two provided variables, and then passes these differences along to the one-sample version. See ts_wilcoxon_os() for details on this.

    Parameters
    ----------
    field1 : pandas series
        the ordinal or scale scores of the first variable
    field2 : pandas series
        the ordinal or scale scores of the second variable
    levels : list or dictionary, optional
        the categories to use (applied to convert the scores to numeric values)
    dmu : float, optional
        hypothesized difference. Default is zero
    appr : {"wilcoxon", "exact", "imanz", "imant"}, optional
        method to use for approximation. Default is "wilcoxon"
    noDiff : {"wilcoxon", "pratt", "zsplit"}, optional
        method to deal with differences equal to dmu. Default is "wilcoxon"
    ties : boolean, optional
        to use a tie correction. Default is True
    cc : boolean, optional
        use a continuity correction. Default is False

    Returns
    -------
    res : dataframe with
    * "nr", the number of ranks used in calculation
    * "mu", the median according to the null hypothesis
    * "W", the Wilcoxon W value
    * "statistic", the test statistic
    * "df", degrees of freedom (only applicable for Iman t approximation)
    * "p-value", significance (p-value)
    * "test", description of the test used

    Notes
    -----
    The formula used (Wilcoxon, 1945):
    $$d_i = x_i - y_i$$

    These differences are then passed on to ts_wilcoxon_os().

    *Symbols used*

    * \\(x_i\\), is the i-th score from the first variable
    * \\(y_i\\), is the i-th score from the second variable

    References
    ----------
    Wilcoxon, F. (1945). Individual comparisons by ranking methods. *Biometrics Bulletin, 1*(6), 80. doi:10.2307/3001968

    Author
    ------
    Made by P. Stikker

    Companion website: https://PeterStatistics.com
    YouTube channel: https://www.youtube.com/stikpet
    Donations: https://www.patreon.com/bePatron?u=19398076
    '''
    # Accept plain lists as well as pandas series
    if isinstance(field1, list):
        field1 = pd.Series(field1)
    if isinstance(field2, list):
        field2 = pd.Series(field2)

    data = pd.concat([field1, field2], axis=1)
    data.columns = ["field1", "field2"]

    # Remove rows with missing values and reset the index
    data = data.dropna()
    data = data.reset_index(drop=True)

    # Convert the scores to numeric values
    if levels is not None:
        data["field1"] = data["field1"].replace(levels)
        data["field1"] = pd.to_numeric(data["field1"])
        data["field2"] = data["field2"].replace(levels)
        data["field2"] = pd.to_numeric(data["field2"])
    else:
        # pd.to_numeric only accepts one-dimensional input, so convert per column
        data = data.apply(pd.to_numeric)

    # Pairwise differences, passed on to the one-sample test
    data["diff"] = data["field1"] - data["field2"]
    res = ts_wilcoxon_os(data["diff"], mu=dmu, ties=ties, appr=appr, eqMed=noDiff, cc=cc)

    # Relabel the test description ("test" is the seventh column of the result)
    res.iloc[0, 6] = res["test"][0].replace("one-sample", "paired samples")
    return res
Functions
def ts_wilcoxon_ps(field1, field2, levels=None, dmu=0, appr='wilcoxon', noDiff='wilcoxon', ties=True, cc=False)
Wilcoxon Signed Rank Test (Paired Samples)
The paired-sample Wilcoxon signed rank test is often considered the non-parametric version of a paired-samples t-test. It can be used to determine if the median is significantly different between the two variables. Strictly speaking it does not always test this, but rather whether the mean rank is significantly different.
The p-value is the probability of a result as in the sample, or more extreme, if the assumption about the population were true. If it is below a certain threshold (usually 0.05), the assumption about the population is rejected.
Results in software packages for this test can vary, since there are a few different approaches, especially when there are so-called ties.
This function simply determines the differences between the two provided variables, and then passes these differences along to the one-sample version. See ts_wilcoxon_os() for details on this.
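The preprocessing described above can be illustrated without the library. The following is a minimal sketch in plain Python, not the function's own code: the helper name `paired_differences` is hypothetical, and it assumes that a pair is discarded whenever either score is missing (None or NaN), which mirrors the row-wise removal of missing values in the source.

```python
import math


def paired_differences(first, second):
    """Return first[i] - second[i] for every pair where both scores are present."""
    if len(first) != len(second):
        raise ValueError("paired samples need the same number of scores")
    diffs = []
    for x, y in zip(first, second):
        # Skip the whole pair if either score is missing (None or NaN).
        if x is None or y is None:
            continue
        if isinstance(x, float) and math.isnan(x):
            continue
        if isinstance(y, float) and math.isnan(y):
            continue
        diffs.append(x - y)
    return diffs


before = [5, 3, None, 4, 2]
after = [4, 3, 2, float("nan"), 5]
print(paired_differences(before, after))  # [1, 0, -3]
```

Only three of the five pairs are complete, so only three differences remain for the one-sample test.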
Parameters
field1 : pandas series
    the ordinal or scale scores of the first variable
field2 : pandas series
    the ordinal or scale scores of the second variable
levels : list or dictionary, optional
    the categories to use (applied to convert the scores to numeric values)
dmu : float, optional
    hypothesized difference. Default is zero
appr : {"wilcoxon", "exact", "imanz", "imant"}, optional
    method to use for approximation. Default is "wilcoxon"
noDiff : {"wilcoxon", "pratt", "zsplit"}, optional
    method to deal with differences equal to dmu. Default is "wilcoxon"
ties : boolean, optional
    to use a tie correction. Default is True
cc : boolean, optional
    use a continuity correction. Default is False
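As an illustration of the `levels` parameter in its dictionary form, the sketch below recodes ordinal labels to numbers before taking differences. It is a plain-Python stand-in for the replace-then-convert step in the source, under the assumption that labels absent from the mapping are left unchanged; the helper `recode` and the example labels are hypothetical.

```python
def recode(scores, levels):
    """Replace each label by its numeric code; labels not in `levels` are kept as-is."""
    return [levels.get(score, score) for score in scores]


# Hypothetical ordinal scale, ordered from lowest to highest.
levels = {"low": 1, "medium": 2, "high": 3}

pre = ["low", "high", "medium"]
post = ["medium", "high", "low"]

pre_num = recode(pre, levels)
post_num = recode(post, levels)

# Differences between the recoded paired scores.
diffs = [a - b for a, b in zip(pre_num, post_num)]
print(diffs)  # [-1, 0, 1]
```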
Returns
res : dataframe with
- "nr", the number of ranks used in calculation
- "mu", the median according to the null hypothesis
- "W", the Wilcoxon W value
- "statistic", the test statistic
- "df", degrees of freedom (only applicable for Iman t approximation)
- "p-value", significance (p-value)
- "test", description of the test used
Notes
The formula used (Wilcoxon, 1945):
$$d_i = x_i - y_i$$
These differences are then passed on to ts_wilcoxon_os().
Symbols used
- \(x_i\), is the i-th score from the first variable
- \(y_i\), is the i-th score from the second variable
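To make the role of the differences concrete, here is a small worked sketch of the textbook signed-rank sums computed from them. It is not the code of ts_wilcoxon_os(): the function name `signed_rank_sums` is hypothetical, zero differences are simply dropped (assumed here to correspond to the "wilcoxon" choice for noDiff), tied absolute differences get their average rank, and which of the two sums the library reports as "W" is not stated in this page.

```python
def signed_rank_sums(diffs, hypothesized=0):
    """Return (n_used, w_plus, w_minus) for a list of paired differences."""
    # Shift by the hypothesized difference and drop differences equal to it.
    shifted = [d - hypothesized for d in diffs if d != hypothesized]

    # Positions ordered by absolute difference, smallest first.
    order = sorted(range(len(shifted)), key=lambda i: abs(shifted[i]))

    ranks = [0.0] * len(shifted)
    start = 0
    while start < len(order):
        # Find the end of the run of tied absolute differences.
        end = start
        while (end + 1 < len(order)
               and abs(shifted[order[end + 1]]) == abs(shifted[order[start]])):
            end += 1
        # Every member of the run gets the average of the 1-based ranks it spans.
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    w_plus = sum(r for r, d in zip(ranks, shifted) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, shifted) if d < 0)
    return len(shifted), w_plus, w_minus


x = [7, 5, 6, 4, 8, 5]
y = [5, 5, 7, 1, 6, 6]
d = [xi - yi for xi, yi in zip(x, y)]  # [2, 0, -1, 3, 2, -1]
print(signed_rank_sums(d))  # (5, 12.0, 3.0)
```

One zero difference is dropped, leaving five ranks; the two sums add up to 5 * 6 / 2 = 15, as they should.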
References
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80. doi:10.2307/3001968
Author
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076