Module stikpetP.effect_sizes.eff_size_bag_s
import pandas as pd
from ..other.table_cross import tab_cross

def es_bag_s(field1, field2, categories=None):
    '''
    Bennett-Alpert-Goldstein S
    --------------------------
    An effect size measure that indicates how strongly two raters or variables agree with each other.

    It takes the proportion of cases on which both agree and adjusts for the number of categories. Scott's pi (see es_scott_pi()) does this as well, and improves on this measure.

    Parameters
    ----------
    field1 : list or pandas series
        the first categorical field
    field2 : list or pandas series
        the second categorical field
    categories : list or dictionary, optional
        order and/or selection for categories of field1 and field2

    Returns
    -------
    S : float, the Bennett-Alpert-Goldstein value

    Notes
    -----
    The formula used (Bennett et al., 1954, p. 307):
    $$S = \\frac{k}{k-1}\\times\\left(p_0 - \\frac{1}{k}\\right)$$

    With:
    $$P = \\sum_{i=1}^r F_{i,i}$$
    $$p_0 = \\frac{P}{n}$$

    *Symbols used*

    * \\(F_{i,j}\\), the observed count in row i and column j
    * \\(r\\), the number of rows (categories in the first variable)
    * \\(k\\), the number of categories; the code takes this to be the number of rows, so \\(k = r\\)
    * \\(n\\), the total number of scores

    References
    ----------
    Bennett, E. M., Alpert, R., & Goldstein, A. C. (1954). Communications through limited response questioning. *Public Opinion Quarterly, 18*(3), 303. doi:10.1086/266520

    Author
    ------
    Made by P. Stikker

    Companion website: https://PeterStatistics.com
    YouTube channel: https://www.youtube.com/stikpet
    Donations: https://www.patreon.com/bePatron?u=19398076

    '''
    # create the cross table (the last row and column hold the totals)
    ct = tab_cross(field1, field2, categories, categories, totals="include")

    # basic counts: number of categories and the grand total
    k = ct.shape[0] - 1
    n = ct.iloc[k, k]

    # STEP 1: determine p0, the proportion of cases on the diagonal
    p0 = 0
    for i in range(0, k):
        p0 = p0 + ct.iloc[i, i]
    p0 = p0 / n

    # STEP 2: adjust for the number of categories
    S = k / (k - 1) * (p0 - 1 / k)

    return S
Functions
def es_bag_s(field1, field2, categories=None)
Bennett-Alpert-Goldstein S
An effect size measure that indicates how strongly two raters or variables agree with each other.
It takes the proportion of cases on which both agree and adjusts for the number of categories. Scott's pi (see es_scott_pi()) does this as well, and improves on this measure.
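To illustrate the adjustment for the number of categories, here is a minimal sketch that applies the formula from the Notes to a fixed observed agreement of 0.7 with different numbers of categories. The helper name `bag_s_from_p0` is hypothetical and not part of the library; it only assumes what the formula itself implies, namely a chance agreement of 1/k.

```python
def bag_s_from_p0(p0, k):
    """Hypothetical helper: S from observed agreement p0 and k categories."""
    if k < 2:
        # k / (k - 1) divides by zero when there is only one category
        raise ValueError("S is undefined for fewer than two categories")
    return k / (k - 1) * (p0 - 1 / k)


# Same observed agreement, different numbers of categories
for k in (2, 3, 4, 10):
    print(k, round(bag_s_from_p0(0.7, k), 4))
```

With the same raw agreement of 0.7, S comes out at roughly 0.4, 0.55, 0.6 and 0.667 for 2, 3, 4 and 10 categories: the more categories there are, the less agreement is attributed to chance.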
Parameters
field1 : list or pandas series
the first categorical field
field2 : list or pandas series
the second categorical field
categories : list or dictionary, optional
order and/or selection for categories of field1 and field2
Returns
S : float, the Bennett-Alpert-Goldstein value
Notes
The formula used (Bennett et al., 1954, p. 307):
$$S = \frac{k}{k-1}\times\left(p_0 - \frac{1}{k}\right)$$
With:
$$P = \sum_{i=1}^r F_{i,i}$$
$$p_0 = \frac{P}{n}$$
Symbols used
- \(F_{i,j}\), the observed count in row i and column j
- \(r\), the number of rows (categories in the first variable)
- \(k\), the number of categories; the code takes this to be the number of rows, so \(k = r\)
- \(n\), the total number of scores
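The calculation can be traced by hand with a small dependency-free sketch. This is an illustrative re-implementation, not the library's `es_bag_s`: the name `bag_s_sketch` is hypothetical, it builds the counts with `collections.Counter` instead of `tab_cross`, and it assumes the categories are simply the labels observed in either field, with no missing values.

```python
from collections import Counter


def bag_s_sketch(field1, field2):
    """Illustrative sketch of Bennett-Alpert-Goldstein S (not the library code)."""
    pairs = list(zip(field1, field2))
    # Assumption: the categories are all labels seen in either field
    categories = sorted(set(field1) | set(field2))
    # F[(i, j)] is the observed count in row i and column j
    F = Counter(pairs)
    k = len(categories)                      # number of categories
    n = len(pairs)                           # total number of scores
    P = sum(F[(c, c)] for c in categories)   # sum of the diagonal counts
    p0 = P / n                               # observed proportion of agreement
    return k / (k - 1) * (p0 - 1 / k)


rater1 = ["a", "a", "b", "b", "c", "c", "a", "b"]
rater2 = ["a", "b", "b", "b", "c", "a", "a", "b"]
print(bag_s_sketch(rater1, rater2))
```

Here the raters agree on 6 of 8 cases, so p0 = 0.75, and with k = 3 categories S = 1.5 × (0.75 − 1/3) = 0.625.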
References
Bennett, E. M., Alpert, R., & Goldstein, A. C. (1954). Communications through limited response questioning. Public Opinion Quarterly, 18(3), 303. doi:10.1086/266520
Author
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076