Module stikpetP.effect_sizes.eff_size_common_language_os

import pandas as pd
from statistics import NormalDist
from ..effect_sizes.eff_size_cohen_d_os import es_cohen_d_os
from ..correlations.cor_rank_biserial_os import r_rank_biserial_os

def es_common_language_os(scores, levels=None, mu=None, version="brute"):
    '''
    Common Language Effect Size (One-Sample)
    -----------------------------------------------
    The Common Language Effect Size is most often used for independent samples or paired samples, but some have adapted the concept for one-sample as well.
    
    It is the probability that a randomly drawn score is higher than the selected value: 
    $$P(X > \\mu_{H_0})$$
    
    Some also argue that ties should count for half, which makes the definition:
    $$P(X > \\mu_{H_0}) + \\frac{P(X = \\mu_{H_0})}{2}$$
    
    This version is implemented in MATLAB (see <a href="https://nl.mathworks.com/matlabcentral/fileexchange/113020-cles">here</a>), based on a Python version from Tulimieri (2021).
    
    For scale data, an approximation using the standard normal distribution is also available using Cohen's d; alternatively, a conversion via the rank-biserial coefficient can be done. These two are used in R’s *effectsize* library from Ben-Shachar et al. (2020).

    This function is shown in this [YouTube video](https://youtu.be/S1zUOkWXg5A) and the measure is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/EffectSizes/CommonLanguageEffectSize.html)

    Parameters
    ----------
    scores : pandas series or list
        the scores
    levels : list or dictionary, optional
        the levels of the scores in order
    mu : float, optional 
        hypothesized value to compare against (default is mid-range)
    version : {"brute", "brute-it", "rb", "normal"}, optional
        method to use, see Notes
        
    Returns
    -------
    CLES : float
        the Common Language Effect Size
    
    Notes
    ------
    For "brute" simply counts all scores above the test statistic and half of the ones that are equal (Tulimieri, 2021):
    $$CL = P(X > \\mu_{H_0}) + \\frac{P(X = \\mu_{H_0})}{2}$$

    With:
    $$P\\left(x \\gt \\mu\\right) = \\frac{\\sum_{i=1}^n \\begin{cases} 1, & \\text{if } x_i \\gt \\mu \\\\ 0, & \\text{otherwise}\\end{cases}}{n}$$
    $$P\\left(x = \\mu\\right) = \\frac{\\sum_{i=1}^n \\begin{cases} 1, & \\text{if } x_i = \\mu \\\\ 0, & \\text{otherwise}\\end{cases}}{n}$$

    This seems to also produce the same result as what Mangiafico (2016, pp. 223–224) calls a VDA-like measure, where VDA is short for Vargha-Delaney A.
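    As a minimal sketch of this count (not the package function itself), using the data from Example 2 below:

```python
def cles_brute(scores, mu):
    # P(X > mu) plus half of P(X = mu)
    n = len(scores)
    n_gt = sum(1 for x in scores if x > mu)
    n_eq = sum(1 for x in scores if x == mu)
    return (n_gt + n_eq / 2) / n

# data from Example 2; mid-range mu = (1 + 5) / 2 = 3
ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
print(cles_brute(ex2, 3))  # 0.6111111111111112 (10 scores above 3, 2 ties)
```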

    With "brute-it" the ties are ignored (it = ignore ties):
    $$CL = P(X > \\mu_{H_0})$$

    The "normal", uses Cohen's d and a normal approximation (Ben-Shachar et al., 2020):
    $$CL = \\Phi\\left(\\frac{d'}{\\sqrt{2}}\\right)$$

    Where $d'$ is Cohen's d for one-sample, and $\\Phi\\left(\\dots\\right)$ the cumulative distribution function of the standard normal distribution.
    This resembles a one-sample version of the formula McGraw and Wong (1992, p. 361) use for independent samples.
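    A sketch of this approximation, assuming $d'$ is the sample mean minus $\\mu$ divided by the sample standard deviation (an illustration, not the package's **es_cohen_d_os()**):

```python
from statistics import NormalDist, mean, stdev

def cles_normal(scores, mu):
    # one-sample Cohen's d': (sample mean - mu) / sample standard deviation
    d = (mean(scores) - mu) / stdev(scores)
    # CL = Phi(d' / sqrt(2))
    return NormalDist().cdf(d / 2 ** 0.5)

# when the sample mean equals mu, d' = 0 and CL = 0.5
print(cles_normal([1, 2, 3, 4, 5], 3))  # 0.5
```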

    The "rb", uses the rank-biserial correlation coefficient (Ben-Shachar et al., 2020):
    $$CL = \\frac{1+r_b}{2}$$

    The CLES can be converted to a rank-biserial correlation (= Cliff delta) using the **es_convert()** function. This can then be converted to a Cohen d, after which the rules-of-thumb for Cohen d can be used (**th_cohen_d()**).
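    The conversion between the two measures is just a linear rescaling; a sketch of both directions (illustrative helpers, not the **es_convert()** function itself):

```python
def rb_to_cles(rb):
    # rank-biserial correlation (-1..1) to CLES (0..1)
    return (1 + rb) / 2

def cles_to_rb(cles):
    # inverse: CLES back to rank-biserial (= Cliff delta)
    return 2 * cles - 1

print(rb_to_cles(0.0))   # 0.5 (no effect)
print(cles_to_rb(0.75))  # 0.5
```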
    

    Before, After and Alternatives
    ------------------------------
    Before this measure you might want to perform the test:
    * [ts_sign_os](../tests/test_sign_os.html#ts_sign_os) for One-Sample Sign Test
    * [ts_trinomial_os](../tests/test_trinomial_os.html#ts_trinomial_os) for One-Sample Trinomial Test
    * [ts_wilcoxon_os](../tests/test_wilcoxon_os.html#ts_wilcoxon_os) for Wilcoxon Signed Rank Test (One-Sample)
    * [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test
    * [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test
    * [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test

    After this you might want a rule-of-thumb directly or by converting this to either rank biserial or Cohen d:
    * [th_cle](../other/thumb_cle.html#th_cle) for CLES rule-of-thumb (incl. conversion options)
    
    Alternative effect size measure with ordinal data:
    * [es_dominance](../effect_sizes/eff_size_dominance.html#es_dominance) for the Dominance score
    * [r_rank_biserial_os](../correlations/cor_rank_biserial_os.html#r_rank_biserial_os) for the Rank-Biserial Correlation
    * [r_rosenthal](../correlations/cor_rosenthal.html#r_rosenthal) for the Rosenthal Correlation if a z-value is available

    Alternative effect size measure with interval or ratio data:
    * [es_cohen_d_os](../effect_sizes/eff_size_cohen_d_os.html#es_cohen_d_os) for Cohen d'
    * [es_hedges_g_os](../effect_sizes/eff_size_hedges_g_os.html#es_hedges_g_os) for Hedges g
    
    References
    ----------
    Ben-Shachar, M., Lüdecke, D., & Makowski, D. (2020). effectsize: Estimation of Effect Size Indices and Standardized Parameters. *Journal of Open Source Software, 5*(56), 1–7. doi:10.21105/joss.02815
    
    Grissom, R. J. (1994). Statistical analysis of ordinal categorical status after therapies. *Journal of Consulting and Clinical Psychology, 62*(2), 281–284. doi:10.1037/0022-006X.62.2.281

    Mangiafico, S. S. (2016). *Summary and analysis of extension program evaluation in R* (1.20.01). Rutgers Cooperative Extension.
    
    McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. *Psychological Bulletin, 111*(2), 361–365. doi:10.1037/0033-2909.111.2.361

    Tulimieri, D. (2021). CLES/CLES. https://github.com/tulimid1/CLES/tree/main
    
    Wolfe, D. A., & Hogg, R. V. (1971). On constructing statistics and reporting data. *The American Statistician, 25*(4), 27–30. doi:10.1080/00031305.1971.10477278
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076

    Examples
    --------
    Example 1: Text Pandas Series
    >>> import pandas as pd
    >>> student_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = student_df['Teach_Motivate']
    >>> order = {"Fully Disagree":1, "Disagree":2, "Neither disagree nor agree":3, "Agree":4, "Fully agree":5}
    >>> es_common_language_os(ex1, levels=order)
    0.35185185185185186
    
    Example 2: Numeric data
    >>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
    >>> es_common_language_os(ex2)
    0.6111111111111112
    
    Example 3: Text data with levels
    >>> ex3 = ["a", "b", "f", "d", "e", "c"]
    >>> order = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}
    >>> es_common_language_os(ex3, levels=order)
    0.5
    
    '''
    if type(scores) is list:
        scores = pd.Series(scores)
    
    #remove missing values
    scores = scores.dropna()
    if levels is not None:
        scores = scores.map(levels).astype('Int8')
    else:
        scores = pd.to_numeric(scores)
    
    scores = scores.sort_values()
    n = len(scores)

    #set hypothesized median to mid range if not provided
    if mu is None:
        mu = (min(scores) + max(scores)) / 2
    
    if version=="brute-it":
        n_gt = sum([1 for i in scores if i > mu])
        cles = n_gt / n
    elif version=="brute":
        n_gt = sum([1 for i in scores if i > mu])
        n_eq = sum([1 for i in scores if i == mu])
        cles = n_gt / n + 1/2*(n_eq/n)
    elif version=="rb":
        rb = r_rank_biserial_os(scores, mu=mu).iloc[0,1]
        cles = (1 + rb)/2
    elif version=="normal":
        d_os = es_cohen_d_os(scores, mu=mu)
        cles = NormalDist().cdf(d_os/(2**0.5))
    return cles

Functions

def es_common_language_os(scores, levels=None, mu=None, version='brute')
