Module stikpetP.effect_sizes.eff_size_hedges_g_is
Expand source code
from statistics import mean, variance
from math import gamma
import pandas as pd
def es_hedges_g_is(catField, scaleField, categories=None, dmu=0, varWeighted=True, corr=None):
'''
Hedges g / Cohen ds (independent samples)
-----------------------------------------
An effect size measure when comparing two means. A few different variations are available. See the details for more information on them.
The measure is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/EffectSizes/HedgesG.html)
Parameters
----------
catField : dataframe or list
the categorical data
scaleField : dataframe or list
the scores
categories : list, optional
to indicate which two categories of catField to use; otherwise the two most frequently occurring categories will be used.
dmu : float, optional
difference according to null hypothesis (default is 0)
varWeighted : boolean, optional
to indicate the use of weighted variances or not. Default is True.
corr : {None, 'exact', 'hedges', 'durlak', 'xue'}, optional
bias correction to use; 'exact' uses the gamma-function correction, the others are approximations. Default is None (no correction).
Returns
-------
A dataframe with:
* *g*, the effect size value
* *version*, description of the effect size calculated
Notes
------
The formula used is (Hedges, 1981, p. 110):
$$g = \\frac{\\bar{x}_1 - \\bar{x}_2}{s_p}$$
With:
$$s_p = \\sqrt{\\frac{SS_1 + SS_2}{n - 2}}$$
$$SS_i = \\sum_{j=1}^{n_i} \\left(x_{i,j} - \\bar{x}_i\\right)^2$$
$$\\bar{x}_i = \\frac{\\sum_{j=1}^{n_i} x_{i,j}}{n_i}$$
*Symbols used:*
* \\(x_{i,j}\\) the j-th score in category i
* \\(n_i\\) the number of scores in category i
This is also what Cohen refers to as \\(d_s\\) (Cohen, 1988, p. 66). If *dmu* is not 0, it is subtracted from the difference in means in the numerator.
By default the weighted formula shown above is used for \\(s_p\\). Sometimes, however, the unweighted version is used; if *varWeighted=False* the following will be used instead:
$$s_p = \\sqrt{\\frac{s_1^2 + s_2^2}{2}}$$
Hedges proposes the following exact bias correction (Hedges, 1981, p. 111):
$$g_{c} = g \\times\\frac{\\Gamma\\left(m\\right)}{\\Gamma\\left(m - \\frac{1}{2}\\right)\\times\\sqrt{m}}$$
With:
$$m = \\frac{df}{2}$$
$$df = n_1 + n_2 - 2 = n - 2$$
*Symbols used:*
* \\(df\\) the degrees of freedom
* \\(n\\) the sample size (i.e. the number of scores)
* \\(\\Gamma\\left(\\dots\\right)\\) the gamma function
The formula used for the approximation of this correction from Hedges (1981, p. 114) (corr="hedges"):
$$g_c = g \\times\\left(1 - \\frac{3}{4\\times df - 1}\\right)$$
This approximation can also be found in Hedges and Olkin (1985, p. 81) and Cohen (1988, p. 66).
The formula used for the approximation from Durlak (2009, p. 927) (corr="durlak"):
$$g_c = g \\times\\frac{n - 3}{n - 2.25} \\times\\sqrt{\\frac{n - 2}{n}}$$
The formula used for the approximation from Xue (2020, p. 3) (corr="xue"):
$$g_c = g \\times \\sqrt[12]{1 - \\frac{9}{df} + \\frac{69}{2\\times df^2} - \\frac{72}{df^3} + \\frac{687}{8\\times df^4} - \\frac{441}{8\\times df^5} + \\frac{247}{16\\times df^6}}$$
Before, After and Alternatives
------------------------------
Before the effect size you might want to run a test. Various options include [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test, [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test.
After obtaining the measure, you might want to use the rules-of-thumb for Cohen d<sub>s</sub>: [th_cohen_d()](../other/thumb_cohen_d.html).
Alternative effect sizes include: [Common Language](../effect_sizes/eff_size_common_language_is.html), [Cohen d_s](../effect_sizes/eff_size_hedges_g_is.html), [Cohen U](../effect_sizes/eff_size_cohen_u.html), [Hedges g](../effect_sizes/eff_size_hedges_g_is.html), [Glass delta](../effect_sizes/eff_size_glass_delta.html)
or the correlation coefficients: [biserial](../correlations/cor_biserial.html), [point-biserial](../effect_sizes/cor_point_biserial.html)
References
----------
Cohen, J. (1988). *Statistical power analysis for the behavioral sciences* (2nd ed.). L. Erlbaum Associates.
Durlak, J. A. (2009). How to select, calculate, and interpret effect sizes. *Journal of Pediatric Psychology, 34*(9), 917–928. https://doi.org/10.1093/jpepsy/jsp004
Hedges, L. V. (1981). Distribution Theory for Glass’s Estimator of Effect Size and Related Estimators. *Journal of Educational Statistics, 6*(2), 107–128. https://doi.org/10.2307/1164588
Hedges, L. V., & Olkin, I. (1985). *Statistical methods for meta-analysis*. Academic Press.
Xue, X. (2020). Improved approximations of Hedges’ g*. https://doi.org/10.48550/arXiv.2003.06675
Author
------
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
--------
Example 1: Dataframe
>>> file1 = "https://peterstatistics.com/Packages/ExampleData/GSS2012a.csv"
>>> df1 = pd.read_csv(file1, sep=',', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = df1['age']
>>> ex1 = ex1.replace("89 OR OLDER", "90")
>>> print(es_hedges_g_is(df1['sex'], ex1, categories=["MALE", "FEMALE"]))
g version
0 -0.045224 Cohen ds / Hedges g (uncorrected)
>>> print(es_hedges_g_is(df1['sex'], ex1, categories=["MALE", "FEMALE"], corr="hedges"))
g version
0 -0.045206 Hedges g (approximation)
>>> print(es_hedges_g_is(df1['sex'], ex1, categories=["MALE", "FEMALE"], corr="durlak"))
g version
0 -0.045183 Hedges g with Durlak approximation
>>> print(es_hedges_g_is(df1['sex'], ex1, categories=["MALE", "FEMALE"], corr="xue"))
g version
0 -0.045206 Hedges g with Xue approximation
Example 2: List
>>> scores = [20,50,80,15,40,85,30,45,70,60, None, 90,25,40,70,65, None, 70,98,40]
>>> groups = ["nat.","int.","int.","nat.","int.", "int.","nat.","nat.","int.","int.","int.","int.","int.","int.","nat.", "int." ,None,"nat.","int.","int."]
>>> es_hedges_g_is(groups, scores)
g version
0 0.858201 Cohen ds / Hedges g (uncorrected)
'''
#convert to pandas series if needed
if type(catField) is list:
catField = pd.Series(catField)
if type(scaleField) is list:
scaleField = pd.Series(scaleField)
#combine as one dataframe
df = pd.concat([catField, scaleField], axis=1)
df = df.dropna()
#the two categories
if categories is not None:
cat1 = categories[0]
cat2 = categories[1]
else:
cat1 = df.iloc[:,0].value_counts().index[0]
cat2 = df.iloc[:,0].value_counts().index[1]
#separate the scores for each category
x1 = list(df.iloc[:,1][df.iloc[:,0] == cat1])
x2 = list(df.iloc[:,1][df.iloc[:,0] == cat2])
#make sure they are floats
x1 = [float(x) for x in x1]
x2 = [float(x) for x in x2]
n1 = len(x1)
n2 = len(x2)
n = n1 + n2
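#statistics.variance uses the sample (n - 1) denominator, so var_i = SS_i/(n_i - 1)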
var1 = variance(x1)
var2 = variance(x2)
m1 = mean(x1)
m2 = mean(x2)
#determine the sum of squared deviations from the mean per category
ss1 = var1*(n1 - 1)
ss2 = var2*(n2 - 1)
if varWeighted:
sp = ((ss1 + ss2)/(n - 2))**0.5
else:
sp = ((var1 + var2)/2)**0.5
#determine Hedges g (Cohen's d_s)
g = (m1 - m2 - dmu)/sp
c = 1
comment = "Cohen ds (Hedges g (uncorrected)"
if corr is not None:
if (corr=="exact"):
if (n - 2 < 171):
c = gamma((n - 2)/2)/(((n - 2)/2)**0.5 * gamma((n - 3)/2))
comment = "Hedges g (exact method)"
else:
print("WARNING: exact method could not be computed due to large sample size, approximation used instead")
c = 1 - 3/(4*(n - 2) - 1)
comment = "Hedges g (approximation)"
elif(corr=="hedges"):
c = 1 - 3/(4*(n - 2) - 1)
comment = "Hedges g (approximation)"
elif(corr=="durlak"):
c = (n - 3)/(n - 2.25)*((n - 2)/n)**0.5
comment = "Hedges g with Durlak approximation"
elif(corr=="xue"):
# Xue (2020, p. 3) approximation, in terms of the degrees of freedom
dof = n - 2
c = (1 - 9/dof + 69/(2*dof**2) - 72/(dof**3) + 687/(8*dof**4) - 441/(8*dof**5) + 247/(16*dof**6))**(1/12)
comment = "Hedges g with Xue approximation"
g = g*c
#the results
colnames = ["g", "version"]
results = pd.DataFrame([[g, comment]], columns=colnames)
return(results)
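As a quick sanity check of the formulas in the notes, the following minimal sketch (an illustration, not part of the package) recomputes the uncorrected g by hand for the list data of Example 2, dropping the two pairs that contain a missing value; it should reproduce the 0.858201 reported there.
from statistics import mean, variance

scores = [20, 50, 80, 15, 40, 85, 30, 45, 70, 60, None, 90, 25, 40, 70, 65, None, 70, 98, 40]
groups = ["nat.", "int.", "int.", "nat.", "int.", "int.", "nat.", "nat.", "int.", "int.",
          "int.", "int.", "int.", "int.", "nat.", "int.", None, "nat.", "int.", "int."]

#keep only the pairs where both the group and the score are present
pairs = [(grp, val) for grp, val in zip(groups, scores) if grp is not None and val is not None]

#es_hedges_g_is picks the two most frequent categories when none are given;
#here that is "int." (12 scores) followed by "nat." (6 scores)
x1 = [float(val) for grp, val in pairs if grp == "int."]
x2 = [float(val) for grp, val in pairs if grp == "nat."]
n1, n2 = len(x1), len(x2)
n = n1 + n2

#SS_i = s_i^2 * (n_i - 1), pooled (weighted) standard deviation s_p
ss1 = variance(x1)*(n1 - 1)
ss2 = variance(x2)*(n2 - 1)
sp = ((ss1 + ss2)/(n - 2))**0.5

g = (mean(x1) - mean(x2))/sp
print(round(g, 6))  #expected to agree with Example 2: 0.858201
Setting varWeighted=False in the function would simply replace the pooled line with sp = ((variance(x1) + variance(x2))/2)**0.5.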
Functions
def es_hedges_g_is(catField, scaleField, categories=None, dmu=0, varWeighted=True, corr=None)
Hedges g / Cohen ds (independent samples). An effect size measure when comparing two means; a few different variations are available (see the notes in the docstring above). The measure is also described at PeterStatistics.com.
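As an illustration of how the corr options relate, the sketch below (again an illustration, not part of the package) evaluates the exact gamma-based correction factor next to the Hedges, Durlak, and Xue approximations for a few total sample sizes; all four factors are below 1 and move toward 1 as the sample grows.
from math import gamma

def c_exact(df):
    #Hedges (1981, p. 111): Gamma(m) / (Gamma(m - 1/2) * sqrt(m)), with m = df/2
    m = df/2
    return gamma(m)/(gamma(m - 0.5)*m**0.5)

def c_hedges(df):
    #Hedges (1981, p. 114) approximation
    return 1 - 3/(4*df - 1)

def c_durlak(n):
    #Durlak (2009, p. 927), written in terms of the total sample size n
    return (n - 3)/(n - 2.25)*((n - 2)/n)**0.5

def c_xue(df):
    #Xue (2020, p. 3) twelfth-root approximation
    return (1 - 9/df + 69/(2*df**2) - 72/df**3 + 687/(8*df**4)
            - 441/(8*df**5) + 247/(16*df**6))**(1/12)

for n in (10, 20, 50, 200):
    df = n - 2
    print(n, round(c_exact(df), 5), round(c_hedges(df), 5),
          round(c_durlak(n), 5), round(c_xue(df), 5))
With samples as large as the GSS data in Example 1 the factors are essentially 1, which is why the corrected values there differ from the uncorrected g only from the fifth decimal onward.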