Module stikpetP.effect_sizes.eff_size_bag_s

import pandas as pd
from ..other.table_cross import tab_cross

def es_bag_s(field1, field2, categories=None):
    '''
    Bennett-Alpert-Goldstein S
    --------------------------
    An effect size measure that indicates how strongly two raters or variables agree with each other.
    
    It takes the proportion of cases on which both agree and adjusts for the number of categories. Scott's pi (see es_scott_pi()) does this as well, and improves on this measure.

    Parameters
    ----------
    field1 : list or pandas series
        the first categorical field
    field2 : list or pandas series
        the second categorical field
    categories : list or dictionary, optional
        order and/or selection for categories of field1 and field2
        
    Returns
    -------
    S : float, the Bennett-Alpert-Goldstein value
    
    Notes
    -----
    The formula used (Bennett et al., 1954, p. 307):
    $$S = \\frac{k}{k-1}\\times\\left(p_0 - \\frac{1}{k}\\right)$$
    
    With:
    $$P = \\sum_{i=1}^r F_{i,i}$$
    $$p_0 = \\frac{P}{n}$$
    
    *Symbols used*
    
    * \\(F_{i,j}\\), the observed count in row i and column j.
    * \\(r\\), the number of rows (categories in the first variable)
    * \\(k\\), the number of categories, taken in the code as the number of rows \\(r\\)
    * \\(n\\), the total number of scores
    
    References
    ----------
    Bennett, E. M., Alpert, R., & Goldstein, A. C. (1954). Communications through limited response questioning. *Public Opinion Quarterly, 18*(3), 303. doi:10.1086/266520
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076
    
    '''
    
    #create the cross table
    ct = tab_cross(field1, field2, categories, categories, totals="include")    
    
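    #basic counts (the last row and column of ct hold the totals)
    k = ct.shape[0]-1
    n = ct.iloc[k, k]
    
    #STEP 1: determine p0, the proportion of cases on the diagonal
    p0 = 0
    for i in range(0, k):
        p0 = p0 + ct.iloc[i, i]
    p0 = p0/n
    
    #STEP 2: adjust p0 for the number of categories
    S = k / (k - 1) * (p0 - 1 / k)
    
    return (S)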

Functions

def es_bag_s(field1, field2, categories=None)

Bennett-Alpert-Goldstein S

An effect size measure that indicates how strongly two raters or variables agree with each other.

It takes the proportion of cases on which both agree and adjusts for the number of categories. Scott's pi (see es_scott_pi()) does this as well, and improves on this measure.
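
The adjustment for the number of categories can be read as a chance correction. Writing p_0 for the observed proportion of agreement and k for the number of categories, the measure rescales p_0 against a chance level of 1/k; reading 1/k as "every category equally likely under chance" is an interpretation of that term, not something stated on this page. A short rearrangement under that reading:

$$S = \frac{k}{k-1}\times\left(p_0 - \frac{1}{k}\right) = \frac{p_0 - \frac{1}{k}}{1 - \frac{1}{k}}$$

From this form, S = 1 when p_0 = 1, S = 0 when p_0 = 1/k, and S = -1/(k-1) when p_0 = 0.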

Parameters

field1 : list or pandas series
the first categorical field
field2 : list or pandas series
the second categorical field
categories : list or dictionary, optional
order and/or selection for categories of field1 and field2

Returns

S : float, the Bennett-Alpert-Goldstein value
 

Notes

The formula used (Bennett et al., 1954, p. 307):

$$S = \frac{k}{k-1}\times\left(p_0 - \frac{1}{k}\right)$$

With:

$$P = \sum_{i=1}^r F_{i,i}$$
$$p_0 = \frac{P}{n}$$

Symbols used

  • F_{i,j}, the observed count in row i and column j.
  • r, the number of rows (categories in the first variable)
  • k, the number of categories, taken in the code as the number of rows r
  • n, the total number of scores
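
The formula can be checked by hand on a small table. The sketch below is not the package's implementation: `bag_s_from_table` is a hypothetical helper that takes an already-built square table of counts (without totals) as a list of lists, and the counts are made up for illustration.

```python
def bag_s_from_table(table):
    """Bennett-Alpert-Goldstein S from a square table of counts (hypothetical helper)."""
    k = len(table)                            # number of categories (rows)
    if k < 2:
        raise ValueError("S needs at least two categories")
    n = sum(sum(row) for row in table)        # total number of scores
    P = sum(table[i][i] for i in range(k))    # cases on the diagonal (agreements)
    p0 = P / n                                # observed proportion of agreement
    return k / (k - 1) * (p0 - 1 / k)

# Illustrative counts (made up): rows = first rater, columns = second rater
table = [
    [20, 5, 0],
    [3, 15, 2],
    [1, 4, 10],
]
S = bag_s_from_table(table)
print(round(S, 4))
```

Here P = 45 and n = 60, so p_0 = 0.75, and with k = 3 the result is 1.5 × (0.75 − 1/3) = 0.625.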

References

Bennett, E. M., Alpert, R., & Goldstein, A. C. (1954). Communications through limited response questioning. Public Opinion Quarterly, 18(3), 303. doi:10.1086/266520

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
