Module `stikpetP.measures.meas_consensus`

import pandas as pd
import numpy as np

def me_consensus(data, levels=None):
    '''
    Consensus
    ---------
    
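    The Consensus is a measure of agreement (the inverse of dispersion) for ordinal data. It ranges from 0, when there is no agreement (the scores are split evenly between the two extreme categories), to 1, when there is full agreement (all scores fall in a single category).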

    This function is shown in this [YouTube video](https://youtu.be/mmTfL5B0p7A) and the measure is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/Measures/Consensus.html)
    
    Parameters
    ----------
    data : list or pandas Series
        the ordinal scores to analyse
    levels : dictionary, optional
        coding to use, mapping each category label to its numeric rank
    
    Returns
    -------
    Cns : the consensus score
    
    Notes
    -----
    The formula used (Tastle et al., 2005, p. 98):
    $$\\text{Cns}\\left(X\\right) = 1 + \\sum_{i=1}^k p_i \\log_2\\left(1 - \\frac{\\left|i - \\mu_X\\right|}{d_X}\\right)$$
    
    With:
    $$\\mu_X = \\frac{\\sum_{i=1}^k i\\times F_i}{n}$$
    $$d_X = k - 1$$
    $$p_i = \\frac{F_i}{n}$$
    
    *Symbols used:*
    
    * $F_i$ the frequency (count) of the $i$-th category (after the categories have been sorted)
    * $n$ the sample size
    * $k$ the number of categories.

    Before, After and Alternatives
    ------------------------------
    Before this measure you might want an impression using a frequency table or a visualisation:
    * [tab_frequency](../other/table_frequency.html#tab_frequency) for a frequency table
    * [vi_bar_stacked_single](../visualisations/vis_bar_stacked_single.html#vi_bar_stacked_single) for Single Stacked Bar-Chart
    * [vi_bar_dual_axis](../visualisations/vis_bar_dual_axis.html#vi_bar_dual_axis) for Dual-Axis Bar Chart

    After this you might want some other descriptive measures:
    * [me_hodges_lehmann_os](../measures/meas_hodges_lehmann_os.html#me_hodges_lehmann_os) for the Hodges-Lehmann Estimate (One-Sample)
    * [me_median](../measures/meas_median.html#me_median) for the Median
    * [me_quantiles](../measures/meas_quantiles.html#me_quantiles) for Quantiles
    * [me_quartiles](../measures/meas_quartiles.html#me_quartiles) for Quartiles / Hinges
    * [me_quartile_range](../measures/meas_quartile_range.html#me_quartile_range) for Interquartile Range, Semi-Interquartile Range and Mid-Quartile Range
    
    or perform a test:
    * [ts_sign_os](../tests/test_sign_os.html#ts_sign_os) for One-Sample Sign Test
    * [ts_trinomial_os](../tests/test_trinomial_os.html#ts_trinomial_os) for One-Sample Trinomial Test
    * [ts_wilcoxon_os](../tests/test_wilcoxon_os.html#ts_wilcoxon_os) for Wilcoxon Signed Rank Test (One-Sample)
     
    
    References 
    ----------
    Tastle, W. J., & Wierman, M. J. (2007). Consensus and dissention: A measure of ordinal dispersion. *International Journal of Approximate Reasoning, 45*(3), 531–545. doi:10.1016/j.ijar.2006.06.024
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076

    Examples
    --------
    Example 1: Text Pandas Series
    >>> import pandas as pd
    >>> student_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = student_df['Teach_Motivate']
    >>> order = {"Fully Disagree":1, "Disagree":2, "Neither disagree nor agree":3, "Agree":4, "Fully agree":5}
    >>> me_consensus(ex1, levels=order)
    np.float64(0.42896860013343563)
    
    Example 2: Numeric data
    >>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
    >>> me_consensus(ex2)
    np.float64(0.3340394927779964)
    
    Example 3: Text data
    >>> ex3 = ["a", "b", "f", "d", "e", "c"]
    >>> order = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}
    >>> me_consensus(ex3, levels=order)
    np.float64(0.4444745779083972)
    
    '''
    
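    # accept a plain list by converting it to a pandas Series
    if isinstance(data, list):
        data = pd.Series(data)
        
    # remove missing values
    data = data.dropna()
    if levels is not None:
        # recode the labels to their numeric ranks
        pd.set_option('future.no_silent_downcasting', True)
        dataN = data.map(levels).astype('Int8')
    else:
        dataN = pd.to_numeric(data)
    
    dataN = dataN.sort_values()
    
    # frequencies of the observed categories, in ascending order
    F = list(dataN.value_counts().sort_index())
    k = len(F)  # number of observed categories
    n = sum(F)  # sample size
    
    # proportion of each category
    P = [f/n for f in F]

    # mean, based on the rank (1 to k) of each observed category
    mu_x = sum([i*F[i-1] for i in range(1, k + 1)]) / n
    # width of the scale
    d_x = k - 1
    
    Cns = 1 + sum([P[i]*np.log2(1 - abs(i + 1 - mu_x)/d_x)  for i in range(k)])
    
    return Cns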

Functions

def me_consensus(data, levels=None)

Consensus

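The Consensus is a measure of agreement (the inverse of dispersion) for ordinal data. It ranges from 0, when there is no agreement (the scores are split evenly between the two extreme categories), to 1, when there is full agreement (all scores fall in a single category).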

This function is shown in this YouTube video and the measure is also described at PeterStatistics.com
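The two boundary values can be checked by hand. The sketch below is a minimal pure-Python version of the formula given in the Notes section. The helper name `consensus_from_counts` is hypothetical and not part of the package, and it assumes the counts are supplied for every category of the scale, in sorted order.

```python
from math import log2

def consensus_from_counts(counts):
    """Consensus from a list of counts, one per category in sorted order.

    Hypothetical helper for illustration only; not part of stikpetP.
    Assumes at least two categories and at least one observation.
    """
    k = len(counts)
    n = sum(counts)
    # mean of the category ranks 1..k
    mu = sum(i * f for i, f in enumerate(counts, start=1)) / n
    d = k - 1
    # skip empty categories: they add nothing and could otherwise hit log2(0)
    return 1 + sum(f / n * log2(1 - abs(i - mu) / d)
                   for i, f in enumerate(counts, start=1) if f > 0)

# full agreement: every response in the same category
full = consensus_from_counts([0, 0, 10, 0, 0])
# no agreement: responses split evenly between the two extremes
none = consensus_from_counts([5, 0, 0, 0, 5])
print(full, none)
```

Any other distribution of the ten responses over the five categories gives a value between these two.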

Parameters

data : list or pandas Series: the ordinal scores to analyse
levels : dictionary, optional: coding to use, mapping each category label to its numeric rank
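As a rough illustration of what the `levels` coding does, the sketch below recodes text labels to ranks and tallies them in plain Python. The names `recode_and_count`, `order` and `answers` are made up for this example. Note one deliberate difference: judging from its source, the function documented here takes the number of categories from the values that actually occur in `data`, whereas this sketch keeps a count for every level in the dictionary, including unused ones.

```python
from collections import Counter

def recode_and_count(data, levels):
    """Return one count per level, ordered by the numeric code in `levels`.

    Hypothetical helper for illustration only; values that are not keys
    of `levels` (including None) are ignored.
    """
    codes = [levels[x] for x in data if x in levels]
    tally = Counter(codes)
    # one entry per level, so unused levels show up as 0
    return [tally.get(code, 0) for code in sorted(levels.values())]

order = {"Disagree": 1, "Neutral": 2, "Agree": 3}
answers = ["Agree", "Disagree", "Agree", None, "Agree"]
counts = recode_and_count(answers, order)
print(counts)  # [1, 0, 3]
```

Whether an unused level should count towards $k$ is a modelling choice; it changes $d_X$ and therefore the score.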

Returns

Cns : the consensus score

Notes

The formula used (Tastle et al., 2005, p. 98):

$$\text{Cns}\left(X\right) = 1 + \sum_{i=1}^k p_i \log_2\left(1 - \frac{\left|i - \mu_X\right|}{d_X}\right)$$

With:

$$\mu_X = \frac{\sum_{i=1}^k i\times F_i}{n}$$

$$d_X = k - 1$$

$$p_i = \frac{F_i}{n}$$

Symbols used:

$F_i$ the frequency (count) of the $i$-th category (after the categories have been sorted)
$n$ the sample size
$k$ the number of categories.
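To make the formula concrete, the sketch below follows it term by term for the data of Example 2, using only the standard library. The variable names mirror the symbols above. It illustrates the arithmetic and is not the package's own implementation.

```python
from collections import Counter
from math import log2

scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

tally = Counter(scores)
F = [tally[v] for v in sorted(tally)]   # frequencies: [3, 3, 2, 3, 7]
n = sum(F)                              # sample size: 18
k = len(F)                              # number of categories: 5

mu_x = sum(i * f for i, f in enumerate(F, start=1)) / n   # 62/18
d_x = k - 1                                                # 4
p = [f / n for f in F]                                     # proportions

cns = 1 + sum(p_i * log2(1 - abs(i - mu_x) / d_x)
              for i, p_i in enumerate(p, start=1))
print(round(cns, 4))  # 0.334
```

The result agrees with the value reported for Example 2 below.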

Before, After and Alternatives

Before this measure you might want an impression using a frequency table or a visualisation:

* tab_frequency for a frequency table
* vi_bar_stacked_single for Single Stacked Bar-Chart
* vi_bar_dual_axis for Dual-Axis Bar Chart

After this you might want some other descriptive measures:

* me_hodges_lehmann_os for the Hodges-Lehmann Estimate (One-Sample)
* me_median for the Median
* me_quantiles for Quantiles
* me_quartiles for Quartiles / Hinges
* me_quartile_range for Interquartile Range, Semi-Interquartile Range and Mid-Quartile Range

or perform a test:

* ts_sign_os for One-Sample Sign Test
* ts_trinomial_os for One-Sample Trinomial Test
* ts_wilcoxon_os for Wilcoxon Signed Rank Test (One-Sample)

References

Tastle, W. J., & Wierman, M. J. (2007). Consensus and dissention: A measure of ordinal dispersion. International Journal of Approximate Reasoning, 45(3), 531–545. doi:10.1016/j.ijar.2006.06.024

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076

Examples

Example 1: Text Pandas Series

>>> import pandas as pd
>>> student_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = student_df['Teach_Motivate']
>>> order = {"Fully Disagree":1, "Disagree":2, "Neither disagree nor agree":3, "Agree":4, "Fully agree":5}
>>> me_consensus(ex1, levels=order)
np.float64(0.42896860013343563)

Example 2: Numeric data

>>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
>>> me_consensus(ex2)
np.float64(0.3340394927779964)

Example 3: Text data

>>> ex3 = ["a", "b", "f", "d", "e", "c"]
>>> order = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}
>>> me_consensus(ex3, levels=order)
np.float64(0.4444745779083972)
