Module stikpetP.effect_sizes.eff_size_glass_delta
from statistics import mean, variance
import pandas as pd
def es_glass_delta(catField, scaleField, categories=None, dmu=0, control=None):
    '''
Glass Delta
-----------
An effect size measure when comparing two means, with a specified control group.
The measure is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/EffectSizes/GlassDelta.html)
Parameters
----------
catField : dataframe or list
the categorical data
scaleField : dataframe or list
the scores
categories : list, optional
the two categories of catField to compare. If omitted, the two most frequent categories are used.
dmu : float, optional
difference according to null hypothesis (default is 0). The code below does not use it in the calculation.
control : string or float, optional
the category to use as control group. If omitted, the second category is used.
Returns
-------
Glass Delta value
Notes
-----
The formula used is (Glass, 1976, p. 7):
$$\\delta = \\frac{\\bar{x}_1 - \\bar{x}_2}{s_2}$$
With:
$$s_2 = \\sqrt{\\frac{\\sum_{i=1}^{n_2} \\left(x_{2,i} - \\bar{x}_2\\right)^2}{n_2 - 1}}$$
$$\\bar{x}_i = \\frac{\\sum_{j=1}^{n_i} x_{i,j}}{n_i}$$
*Symbols used:*
* \\(x_{i,j}\\) the j-th score in category i
* \\(n_i\\) the number of scores in category i
Glass uses a ‘control group’, and \\(s_2\\) is then the standard deviation of that control group.
Before, After and Alternatives
------------------------------
Before the effect size you might want to run a test. Various options include [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test, [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test.
Unfortunately, I've been unable to find any rule-of-thumb specifically for Glass Delta, but most likely the ones from Cohen d should be a decent alternative. These are available with [th_cohen_d()](../other/thumb_cohen_d.html)
Alternative effect sizes include: [Common Language](../effect_sizes/eff_size_common_language_is.html), [Cohen d_s](../effect_sizes/eff_size_hedges_g_is.html), [Cohen U](../effect_sizes/eff_size_cohen_u.html), [Hedges g](../effect_sizes/eff_size_hedges_g_is.html), [Glass delta](../effect_sizes/eff_size_glass_delta.html)
or the correlation coefficients: [biserial](../correlations/cor_biserial.html), [point-biserial](../effect_sizes/cor_point_biserial.html)
References
----------
Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. *Educational Researcher, 5*(10), 3–8. https://doi.org/10.3102/0013189X005010003
Author
------
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
--------
Example 1: Dataframe
>>> file1 = "https://peterstatistics.com/Packages/ExampleData/GSS2012a.csv"
>>> df1 = pd.read_csv(file1, sep=',', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = df1['age']
>>> ex1 = ex1.replace("89 OR OLDER", "90")
>>> es_glass_delta(df1['sex'], ex1, control="FEMALE")
0.04509629567422838
Example 2: List
>>> scores = [20,50,80,15,40,85,30,45,70,60, None, 90,25,40,70,65, None, 70,98,40]
>>> groups = ["nat.","int.","int.","nat.","int.", "int.","nat.","nat.","int.","int.","int.","int.","int.","int.","nat.", "int." ,None,"nat.","int.","int."]
>>> es_glass_delta(groups, scores)
0.83604435914283
    '''
    # convert to pandas Series if needed
    if type(catField) is list:
        catField = pd.Series(catField)
    if type(scaleField) is list:
        scaleField = pd.Series(scaleField)

    # combine as one dataframe and drop rows with a missing category or score
    df = pd.concat([catField, scaleField], axis=1)
    df = df.dropna()

    # the two categories (value_counts sorts by frequency, most frequent first)
    if categories is not None:
        cat1 = categories[0]
        cat2 = categories[1]
    else:
        cat1 = df.iloc[:,0].value_counts().index[0]
        cat2 = df.iloc[:,0].value_counts().index[1]

    # separate the scores for each category
    x1 = list(df.iloc[:,1][df.iloc[:,0] == cat1])
    x2 = list(df.iloc[:,1][df.iloc[:,0] == cat2])

    # make sure they are floats
    x1 = [float(x) for x in x1]
    x2 = [float(x) for x in x2]

    n1 = len(x1)
    n2 = len(x2)
    n = n1 + n2

    # sample variances (n - 1 in the denominator) and means
    var1 = variance(x1)
    var2 = variance(x2)
    m1 = mean(x1)
    m2 = mean(x2)
    sd1 = (var1)**0.5
    sd2 = (var2)**0.5

    # standard deviation of the control group (second category by default)
    if control is None or control == cat2:
        s = sd2
    elif control == cat1:
        s = sd1
    else:
        # previously s was left undefined in this case
        raise ValueError("control must be one of the two categories being compared")

    gd = (m1 - m2)/s
    return gd
Functions
def es_glass_delta(catField, scaleField, categories=None, dmu=0, control=None)
Glass Delta
An effect size measure when comparing two means, with a specified control group.
The measure is also described at PeterStatistics.com
Parameters
catField : dataframe or list
    the categorical data
scaleField : dataframe or list
    the scores
categories : list, optional
    the two categories of catField to compare. If omitted, the two most frequent categories are used.
dmu : float, optional
    difference according to null hypothesis (default is 0). The source code does not use it in the calculation.
control : string or float, optional
    the category to use as control group. If omitted, the second category is used.
Returns
Glass Delta value
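The source code pairs the two inputs, drops every pair with a missing value, and, when `categories` is not given, compares the two most frequent categories. The sketch below mimics that preparation step with the standard library only. It is an illustration, not the library's code: `prepare_groups` is a hypothetical name, the function itself uses pandas, and ties in frequency may be ordered differently by pandas than by `Counter`.

```python
from collections import Counter

def prepare_groups(cat_values, scores, categories=None):
    """Hypothetical helper: pair categories with scores and split into two groups."""
    # keep only pairs where both the category and the score are present
    # (assumption: missing values are encoded as None, not NaN)
    pairs = [(c, float(s)) for c, s in zip(cat_values, scores)
             if c is not None and s is not None]

    # if no categories are given, take the two most frequent ones
    if categories is None:
        counts = Counter(c for c, _ in pairs)
        categories = [c for c, _ in counts.most_common(2)]

    cat1, cat2 = categories[0], categories[1]
    x1 = [s for c, s in pairs if c == cat1]
    x2 = [s for c, s in pairs if c == cat2]
    return cat1, x1, cat2, x2

# data from Example 2
scores = [20, 50, 80, 15, 40, 85, 30, 45, 70, 60, None, 90, 25, 40, 70, 65, None, 70, 98, 40]
groups = ["nat.", "int.", "int.", "nat.", "int.", "int.", "nat.", "nat.", "int.", "int.",
          "int.", "int.", "int.", "int.", "nat.", "int.", None, "nat.", "int.", "int."]

cat1, x1, cat2, x2 = prepare_groups(groups, scores)
print(cat1, len(x1), cat2, len(x2))
```

With this data the first category is "int." (12 usable scores) and the second is "nat." (6 usable scores), so "nat." acts as the control group when `control` is not set.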
Notes
The formula used is (Glass, 1976, p. 7):
$$\delta = \frac{\bar{x}_1 - \bar{x}_2}{s_2}$$
With:
$$s_2 = \sqrt{\frac{\sum_{i=1}^{n_2} \left(x_{2,i} - \bar{x}_2\right)^2}{n_2 - 1}}$$
$$\bar{x}_i = \frac{\sum_{j=1}^{n_i} x_{i,j}}{n_i}$$
Symbols used:
- x_{i,j} the j-th score in category i
- n_i the number of scores in category i
Glass uses a ‘control group’, and s_2 is then the standard deviation of that control group.
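To make the formula concrete, here is a minimal hand calculation with the standard library, using the Example 2 scores after pairs with a missing value have been dropped. The name `glass_delta` is a hypothetical helper for this sketch, not part of the package.

```python
from statistics import mean, stdev

# Example 2 scores, split by category, with missing pairs already removed
international = [50, 80, 40, 85, 70, 60, 90, 25, 40, 65, 98, 40]
national = [20, 15, 30, 45, 70, 70]

def glass_delta(x1, x2, control_scores):
    """(mean of x1 - mean of x2) divided by the sample SD of the control group."""
    return (mean(x1) - mean(x2)) / stdev(control_scores)

# second group as control (the default behaviour in the source code)
delta_nat_control = glass_delta(international, national, national)

# first group as control: same numerator, different denominator
delta_int_control = glass_delta(international, national, international)

print(round(delta_nat_control, 4))  # 0.836
```

Only the denominator changes when the control group is switched, so the two values have the same sign but differ in size.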
Before, After and Alternatives
Before the effect size you might want to run a test. Various options include ts_student_t_os for One-Sample Student t-Test, ts_trimmed_mean_os for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or ts_z_os for One-Sample Z Test.
Unfortunately, I've been unable to find any rule-of-thumb specifically for Glass Delta, but most likely the ones from Cohen d should be a decent alternative. These are available with th_cohen_d()
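As a rough illustration of borrowing Cohen's d conventions, the sketch below applies the commonly quoted cutoffs of 0.2, 0.5 and 0.8 to the absolute value of an effect size. `classify_cohen_d` is a hypothetical helper written for this example. It is not the package's `th_cohen_d()`, whose signature and output are not shown here, and the label "negligible" for values below 0.2 is an assumption.

```python
def classify_cohen_d(d):
    """Label an effect size using the conventional 0.2 / 0.5 / 0.8 cutoffs."""
    size = abs(d)  # the sign only indicates direction
    if size < 0.2:
        return "negligible"  # assumed label for values below the 'small' cutoff
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# the two results from the Examples section
print(classify_cohen_d(0.04509629567422838))
print(classify_cohen_d(0.83604435914283))
```

Whether these cutoffs suit Glass Delta depends on how different the control group's standard deviation is from the pooled one, so treat the labels as a loose guide.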
Alternative effect sizes include: Common Language, Cohen d_s, Cohen U, Hedges g, Glass delta
or the correlation coefficients: biserial, point-biserial
References
Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5(10), 3–8. https://doi.org/10.3102/0013189X005010003
Author
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
Example 1: Dataframe
>>> file1 = "https://peterstatistics.com/Packages/ExampleData/GSS2012a.csv"
>>> df1 = pd.read_csv(file1, sep=',', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = df1['age']
>>> ex1 = ex1.replace("89 OR OLDER", "90")
>>> es_glass_delta(df1['sex'], ex1, control="FEMALE")
0.04509629567422838
Example 2: List
>>> scores = [20,50,80,15,40,85,30,45,70,60, None, 90,25,40,70,65, None, 70,98,40]
>>> groups = ["nat.","int.","int.","nat.","int.", "int.","nat.","nat.","int.","int.","int.","int.","int.","int.","nat.", "int." ,None,"nat.","int.","int."]
>>> es_glass_delta(groups, scores)
0.83604435914283