Module stikpetP.helper.help_quantileIndexing

import pandas as pd
import math

def he_quantileIndexing(data, k=4, method="sas1"):
    '''
    Quantile Indexing
    
    Helper function for **me_quantiles()** that returns the index number of each quantile.
    
    Parameters
    ----------
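    data : pandas series with numeric values
    k : optional, number of quantiles (default 4, i.e. quartiles)
    method : optional, which method to use to calculate the indexes. Either "sas1" (default), "sas4", "hl", "excel", "hf8", or "hf9"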
    
    Returns
    -------
    indexes : pandas series with the k + 1 index positions of the quantiles (for i = 0 to k), 1-based and possibly fractional
    
    Notes
    -----
    Six alternatives for the indexing are available. In each, n is the number of values in data and p_i = i/k for i = 0, 1, ..., k. Any result below 1 is set to 1, and any result above n is set to n.
    
    The most basic method (**SAS1**) uses:
    $$iQ_i = n\\times p_i$$

    The **SAS4** method uses (SAS, 1990, p. 626; Snedecor, 1940, p. 43):
    $$iQ_i = \\left(n + 1\\right)\\times p_i$$

    **Hogg and Ledolter** use (Hogg & Ledolter, 1992, p. 21; Hazen, 1914, p. ?):

    $$iQ_i = n\\times p_i + \\frac{1}{2}$$

    **MS Excel** uses (Gumbel, 1939, p. ?; Hyndman & Fan, 1996, p. 363):
    $$iQ_i = \\left(n - 1\\right)\\times p_i + 1$$

    **Hyndman and Fan** use for their 8th version (Hyndman & Fan, 1996, p. 363):
    $$iQ_i = \\left(n + \\frac{1}{3}\\right)\\times p_i + \\frac{1}{3}$$

    **Hyndman and Fan** use for their 9th version (Hyndman & Fan, 1996, p. 364):
    $$iQ_i = \\left(n + \\frac{1}{4}\\right)\\times p_i + \\frac{3}{8}$$
    
    References
    ----------
    Gumbel, E. J. (1939). La Probabilité des Hypothèses. Comptes Rendus de l'Académie des Sciences, 209, 645–647.
    
    Hazen, A. (1914). Storage to be provided in impounding municipal water supply. Transactions of the American Society of Civil Engineers, 77(1), 1539–1640. https://doi.org/10.1061/taceat.0002563
    
    Hogg, R. V., & Ledolter, J. (1992). Applied statistics for engineers and physical scientists (2nd int.). Macmillan.
    
    Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4), 361–365. https://doi.org/10.2307/2684934
   
    SAS. (1990). SAS procedures guide: Version 6 (3rd ed.). SAS Institute.
    
    Snedecor, G. W. (1940). Statistical methods applied to experiments in agriculture and biology (3rd ed.). The Iowa State College Press.
    
    Author
    ------
    Made by P. Stikker
    
    Please visit: https://PeterStatistics.com
    
    YouTube channel: https://www.youtube.com/stikpet
    
    '''
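    # step size between successive proportions
    props = 1/k
    n = len(data)
    
    # proportions p_i = i/k for i = 0, 1, ..., k
    indexes = pd.Series(dtype='float64')
    for i in range(k+1):
        indexes.at[i] = i*props
    
    # convert the proportions to 1-based index positions
    if method=="sas1":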
        indexes = n*indexes
    elif method=="sas4":
        indexes = (n + 1)*indexes
    elif method=="hl":  
        indexes = n*indexes + 1/2
    elif method=="excel":
        indexes = (n - 1)*indexes + 1
    elif method=="hf8":
        indexes = (n + 1/3)*indexes + 1/3
    elif method=="hf9":
        indexes = (n + 1/4)*indexes + 3/8
    
    # clamp each index to the valid 1-based range [1, n]
    for i in range(k+1):
        if indexes.at[i] < 1:
            indexes.at[i] = 1
        elif indexes.at[i] > n:
            indexes.at[i] = n
            
    return indexes
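
The final loop in the listing above limits every index to the range from 1 to n. As a minimal sketch of that step alone, the snippet below applies the same rule to the Hogg and Ledolter formula for a small sample of n = 3 with quartiles. The helper name `clamp_index` is hypothetical and not part of the module, and plain Python lists stand in for the pandas series.

```python
def clamp_index(value, n):
    """Limit a 1-based index position to the range [1, n]."""
    if value < 1:
        return 1
    if value > n:
        return n
    return value


n = 3
# Hogg and Ledolter indexing, n*p + 1/2, for the proportions 0, 1/4, ..., 1
raw = [n * p + 1 / 2 for p in (0, 0.25, 0.5, 0.75, 1)]
clamped = [clamp_index(x, n) for x in raw]

print(raw)      # [0.5, 1.25, 2.0, 2.75, 3.5]
print(clamped)  # [1, 1.25, 2.0, 2.75, 3]
```

Only the first and last positions change here, because they are the only ones that fall outside the range of valid positions.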

Functions

def he_quantileIndexing(data, k=4, method='sas1')

Quantile Indexing

Helper function for me_quantiles() that returns the index number of each quantile.

Parameters

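data : pandas series with numeric values

k : optional, number of quantiles (default 4, i.e. quartiles)

method : optional, which method to use to calculate the indexes. Either "sas1" (default), "sas4", "hl", "excel", "hf8", or "hf9"
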

Returns

indexes : pandas series with the k + 1 index positions of the quantiles (for i = 0 to k), 1-based and possibly fractional
 

Notes

Six alternatives for the indexing are available. In each, n is the number of values in data and p_i = i/k for i = 0, 1, ..., k. Any result below 1 is set to 1, and any result above n is set to n.

The most basic method (SAS1) uses:

$$iQ_i = n\times p_i$$

The SAS4 method uses (SAS, 1990, p. 626; Snedecor, 1940, p. 43):

$$iQ_i = \left(n + 1\right)\times p_i$$

Hogg and Ledolter use (Hogg & Ledolter, 1992, p. 21; Hazen, 1914, p. ?):

$$iQ_i = n\times p_i + \frac{1}{2}$$

MS Excel uses (Gumbel, 1939, p. ?; Hyndman & Fan, 1996, p. 363):

$$iQ_i = \left(n - 1\right)\times p_i + 1$$

Hyndman and Fan use for their 8th version (Hyndman & Fan, 1996, p. 363):

$$iQ_i = \left(n + \frac{1}{3}\right)\times p_i + \frac{1}{3}$$

Hyndman and Fan use for their 9th version (Hyndman & Fan, 1996, p. 364):

$$iQ_i = \left(n + \frac{1}{4}\right)\times p_i + \frac{3}{8}$$
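
To make the six formulas concrete, here is a minimal sketch that evaluates them for a given sample size. It is not the module's own code: the function name `quantile_indexes` is hypothetical, it takes n directly instead of a pandas series, and it returns a plain list. It also rejects an unknown method name with a `KeyError`, which is an assumption of this sketch.

```python
def quantile_indexes(n, k=4, method="sas1"):
    """Return the k + 1 index positions (1-based, possibly fractional)."""
    formulas = {
        "sas1": lambda p: n * p,
        "sas4": lambda p: (n + 1) * p,
        "hl": lambda p: n * p + 1 / 2,
        "excel": lambda p: (n - 1) * p + 1,
        "hf8": lambda p: (n + 1 / 3) * p + 1 / 3,
        "hf9": lambda p: (n + 1 / 4) * p + 3 / 8,
    }
    formula = formulas[method]  # raises KeyError for an unknown method
    # evaluate the formula at p_i = i/k for i = 0, 1, ..., k
    raw = [formula(i / k) for i in range(k + 1)]
    # limit every position to the range [1, n]
    return [min(max(x, 1), n) for x in raw]


print(quantile_indexes(10, 4, "excel"))  # [1.0, 3.25, 5.5, 7.75, 10.0]
print(quantile_indexes(10, 4, "sas4"))   # [1, 2.75, 5.5, 8.25, 10]
```

For n = 10 the median position (i = 2) is 5.5 under every method except SAS1, which gives 5. The methods differ more at the first and third quartiles.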

References

Gumbel, E. J. (1939). La Probabilité des Hypothèses. Comptes Rendus de l'Académie des Sciences, 209, 645–647.

Hazen, A. (1914). Storage to be provided in impounding municipal water supply. Transactions of the American Society of Civil Engineers, 77(1), 1539–1640. https://doi.org/10.1061/taceat.0002563

Hogg, R. V., & Ledolter, J. (1992). Applied statistics for engineers and physical scientists (2nd int.). Macmillan.

Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4), 361–365. https://doi.org/10.2307/2684934

SAS. (1990). SAS procedures guide: Version 6 (3rd ed.). SAS Institute.

Snedecor, G. W. (1940). Statistical methods applied to experiments in agriculture and biology (3rd ed.). The Iowa State College Press.

Author

Made by P. Stikker

Please visit: https://PeterStatistics.com

YouTube channel: https://www.youtube.com/stikpet
