Module stikpetP.measures.meas_mean

import math
from numpy import log
import pandas as pd
from .meas_quantiles import me_quantiles

def me_mean(data, levels=None, version="arithmetic", trimProp=0.1, trimFrac="down"):
    '''
    Mean
    ----
    
    Different types of means can be determined using this function. 
    The mean is a measure of central tendency, to indicate the center.

    This function is shown in this [YouTube video](https://youtu.be/_imyahvt-qE) and the measure is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/Measures/mean.html)
    
    Parameters
    ----------
    data : list or pandas data series 
        numeric data
    levels : dictionary, optional 
        coding to use
    version : {"arithmetic", "winsorized", "trimmed", "windsor", "truncated", "olympic", "geometric", "harmonic", "midrange", "decile"}, optional
        mean to calculate. Default is "arithmetic"
    trimProp : float, optional
        the total proportion to trim. Default is 0.1, i.e. 0.05 from each side.
    trimFrac : {"down", "prop", "linear"}, optional 
        how to handle a non-integer number of scores to trim. Default is "down"
    
    Returns
    -------
    res : float
        value of the mean
    
    Notes
    -----
    
    **Arithmetic Mean**
    
    One of the three Pythagorean means, and the mean most people would assume if asked to calculate the mean.
    It is the fulcrum of the distribution (Weinberg & Schumaker, 1962, p. 19). An early reference can, for 
    example, be found in Aristotle (384-322 BC) (1850, p. 43). 
    
    The formula:
    $$\\bar{x} = \\frac{\\sum_{i=1}^n x_i}{n}$$
    
    **Harmonic Mean**
    
    The second of the three Pythagorean means:
    $$H = \\frac{n}{\\sum_{i=1}^n \\frac{1}{x_i}}$$
    
    **Geometric Mean**
    
    The third of the three Pythagorean means:
    $$G = e^{\\frac{1}{n}\\times\\sum_{i=1}^n \\ln\\left(x_i\\right)}$$
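    As a quick illustration (a small hypothetical sample, not from the package), the three Pythagorean means can be computed directly from their formulas:

```python
import math

# Hypothetical sample chosen so the geometric mean comes out exactly 4
x = [2, 4, 8]
n = len(x)

arithmetic = sum(x) / n                                 # (2 + 4 + 8) / 3
harmonic = n / sum(1 / v for v in x)                    # 3 / (1/2 + 1/4 + 1/8)
geometric = math.exp(sum(math.log(v) for v in x) / n)   # exp of the mean of logs
```

    For positive data these always satisfy $H \\le G \\le \\bar{x}$, with equality only when all values are the same.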
    
    **Olympic Mean**
    
    Simply ignore the maximum and minimum (only once) (Louis et al., 2023, p. 117):
    $$OM = \\frac{\\sum_{i=2}^{n-1} x_i}{n - 2}$$
    
    **Mid Range**
    
    The average of the maximum and minimum (Lovitt & Holtzclaw, 1931, p. 91):
    $$MR = \\frac{\\min x + \\max x}{2}$$
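    A minimal sketch contrasting the two (hypothetical sample with one outlier): the olympic mean drops the single minimum and maximum, while the midrange uses only those two values and is therefore very sensitive to outliers.

```python
# Hypothetical sample with an outlier at 100
x = [1, 3, 5, 7, 100]

olympic = (sum(x) - max(x) - min(x)) / (len(x) - 2)  # mean of the middle scores 3, 5, 7
midrange = (min(x) + max(x)) / 2                     # (1 + 100) / 2
```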
    
    **Trimmed**
    
    With a trimmed (Windsor/Truncated) mean we trim a fixed number of scores from each side (Tukey, 1962, p. 17).
    Let $p_t$ be the proportion to trim; we then need to trim $n_t = \\frac{p_t\\times n}{2}$ 
    from each side.
    
    If this $n_t$ is an integer there isn't a problem, but if it isn't we have options. The 
    first option is to simply round down, i.e. $n_l = \\lfloor n_t\\rfloor$. The trimmed mean is then:
    
    $$\\bar{x}_t = \\frac{\\sum_{i=n_l+1}^{n - n_l} x_i}{n - 2\\times n_l}$$
    This is used if *trimFrac = "down"* is set.
    
    We could also use linear interpolation based on the number of scores to trim. We missed out on:
    $f = n_t - n_l$ on each side. So the first and last value we do include should 
    only count for $1 - f$ each. The trimmed mean will then be:
    $$\\bar{x}_t = \\frac{\\left(x_{n_l + 1} + x_{n - n_l}\\right)\\times\\left(1 - f\\right) + \\sum_{i=n_l+2}^{n - n_l - 1} x_i}{n - 2\\times n_t}$$
    This is used if *trimFrac = "prop"* is set.

    Alternatively, we could take the proportion itself and use linear interpolation on that. The found $n_l$ 
    covers $p_1 = \\frac{n_l \\times 2}{n}$ of the total sample size, while if we had rounded up, we would have 
    used $p_2 = \\frac{\\left(n_l + 1\\right)\\times 2}{n}$ of the total sample size. Using linear interpolation we 
    then get:
    $$\\bar{x}_t = \\frac{p_t - p_1}{p_2 - p_1}\\times\\left(\\bar{x}_{th}-\\bar{x}_{tl}\\right) + \\bar{x}_{tl}$$
    Where $\\bar{x}_{tl}$ is the trimmed mean if $p_1$ would be used as a trim proportion, and $\\bar{x}_{th}$ is the 
    trimmed mean if $p_2$ would be used.
    This is used if *trimFrac = "linear"* is set.
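    A sketch of the "down" and "prop" options on a hypothetical sorted sample where the trim amount is non-integer (trimProp = 0.3 on n = 10 gives 1.5 scores per side):

```python
import math

x = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
n = len(x)
p_t = 0.3
n_t = n * p_t / 2        # 1.5 scores to trim per side
n_l = math.floor(n_t)    # round down: trim 1 whole score per side
f = n_t - n_l            # fractional part left over: 0.5

# trimFrac="down": simply drop n_l scores from each side
down = sum(x[n_l:n - n_l]) / (n - 2 * n_l)

# trimFrac="prop": the outermost kept scores each count for (1 - f)
prop = (x[n_l] * (1 - f) + x[n - n_l - 1] * (1 - f)
        + sum(x[n_l + 1:n - n_l - 1])) / (n - 2 * n_t)
```

    On this symmetric sample both options give the same value; on skewed data they differ.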
    
    **Winsorized Mean**
    
    Similar to a trimmed mean, but instead of removing scores, each trimmed score is replaced by the nearest 
    value that is still included (Winsor as cited in Dixon, 1960, p. 385).
    $$W = \\frac{n_l \\times \\left(x_{n_l + 1} + x_{n - n_l}\\right) + \\sum_{i=n_l + 1}^{n - n_l} x_i}{n}$$
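    A sketch of the replacement step on a hypothetical sample (trimProp = 0.4 on n = 10, so $n_l = 2$ scores on each side are replaced by the nearest value that is kept):

```python
import math

x = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
n = len(x)
n_l = math.floor(n * 0.4 / 2)          # 2 scores per side

w = x[:]                               # copy before replacing
w[:n_l] = [x[n_l]] * n_l               # low tail -> first kept value
w[n - n_l:] = [x[n - n_l - 1]] * n_l   # high tail -> last kept value
winsorized = sum(w) / n                # mean over the full (replaced) sample
```

    Unlike trimming, the denominator stays $n$, since no scores are removed.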
    
    **Decile Mean**
    
    The formula used is (Rana et al., 2012, p. 480):
    $$DM = \\frac{\\sum_{i=1}^{9} D_i}{9}$$
    
    Where $D_i$ is the $i$-th decile score (quantiles with $k=10$).
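    As a rough sketch, using numpy's quantile function as a stand-in for the package's me_quantiles (numpy's default "linear" interpolation may differ from the method me_quantiles uses, so results can deviate slightly):

```python
import numpy as np

x = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

# D_1 ... D_9: the nine deciles (quantiles with k=10)
deciles = np.quantile(x, [i / 10 for i in range(1, 10)])
dm = deciles.sum() / 9
```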
    
    Before, After and Alternatives
    ------------------------------
    Before this you might want to create a binned frequency table or a visualisation:
    * [tab_frequency_bins](../other/table_frequency_bins.html#tab_frequency_bins) to create a binned frequency table
    * [vi_boxplot_single](../visualisations/vis_boxplot_single.html#vi_boxplot_single) for a Box (and Whisker) Plot
    * [vi_histogram](../visualisations/vis_histogram.html#vi_histogram) for a Histogram
    * [vi_stem_and_leaf](../visualisations/vis_stem_and_leaf.html#vi_stem_and_leaf) for a Stem-and-Leaf Display

    After this you might want some other descriptive measures:
    * [me_mode_bin](../measures/meas_mode_bin.html#me_mode_bin) for Mode for Binned Data
    * [me_variation](../measures/meas_variation.html#me_variation) for different Measures of Quantitative Variation
    
    Or perform a test:
    * [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test
    * [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test
    * [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test
    
    
    References
    ----------
    Aristotle. (1850). *The nicomachean ethics of Aristotle* (R. W. Browne, Trans.). Henry G. Bohn.
    
    Dixon, W. J. (1960). Simplified estimation from censored normal samples. *The Annals of Mathematical Statistics, 31*(2), 385–391. doi:10.1214/aoms/1177705900
    
    Louis, P., Núñez, M., & Xefteris, D. (2023). Trimming extreme reports in preference aggregation. *Games and Economic Behavior, 137*, 116–151. doi:10.1016/j.geb.2022.11.003
    
    Lovitt, W. V., & Holtzclaw, H. F. (1931). *Statistics*. Prentice Hall.
    
    Rana, S., Doulah, Md. S., Midi, H., & Imon, A. (2012). Decile mean: A new robust measure of central tendency. *Chiang Mai Journal of Science, 39*, 478–485.

    Tukey, J. W. (1962). The future of data analysis. *The Annals of Mathematical Statistics, 33*(1), 1–67. doi:10.1214/aoms/1177704711
    
    Weinberg, G. H., & Schumaker, J. A. (1962). *Statistics: An intuitive approach*. Wadsworth Publishing.
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076
    
    Examples
    --------
    Example 1: Numeric Pandas Series
    >>> import pandas as pd
    >>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = df2['Gen_Age']
    >>> me_mean(ex1)
    np.float64(24.454545454545453)
    
    Example 2: Numeric list
    >>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
    >>> me_mean(ex2)
    np.float64(3.4444444444444446)

    '''
    if type(data) is list:
        data = pd.Series(data)
        
    data = data.dropna()
    if levels is not None:
        dataN = data.replace(levels)
        dataN = pd.to_numeric(dataN)
    else:
        dataN = pd.to_numeric(data)
    
    dataN = dataN.sort_values().reset_index(drop=True)    
    n = len(dataN)
        
    if version=="arithmetic":
        res = dataN.mean()
    elif version in ["winsorized", "trimmed", "windsor", "truncated"]:
        # dataN is already numeric, sorted, and re-indexed above
        nt1 = n*trimProp/2
        nl = math.floor(nt1)
        
        if version=="winsorized":
            dataN[0:nl] = dataN[nl]
            dataN[n-nl:n] = dataN[n-nl-1]
            res = dataN.mean()
        
        else:
            if trimFrac=="down":
                res = dataN[nl:(n-nl)].mean()
            elif trimFrac=="prop":
                fr = nt1 - nl
                res = (dataN[nl]*(1 - fr) + dataN[n-nl-1]*(1 - fr) + sum(dataN[nl+1:(n-nl-1)]))/(n - nt1*2)
            elif trimFrac=="linear":
                p1 = nl*2/n
                p2 = (nl + 1)*2/n
                m1 = dataN[nl:(n-nl)].mean()
                m2 = dataN[(nl+1):(n-nl-1)].mean()
                res = (trimProp - p1)/(p2 - p1)*(m2 - m1)+m1
    
    elif version=="olympic":
        res = (dataN.sum() - max(dataN) - min(dataN))/(len(dataN)-2)
        
    elif version=="geometric":
        res = math.exp(sum(log(dataN))/n)
        
    elif version=="harmonic":
        res = n/sum(1/dataN)
        
    elif version=="midrange":
        res = (max(dataN) + min(dataN))/2
        
    elif version=="decile":
        deci = me_quantiles(dataN, k=10)
        dm = 0
        for i in range(1, 10):
            dm = dm + deci[i]
        dm = dm/9
        res = dm
    
    else:
        # an unknown version would otherwise leave res undefined (NameError)
        raise ValueError("unsupported version: " + str(version))
    
    return res
