Module stikpetP.measures.meas_consensus
import pandas as pd
import numpy as np
def me_consensus(data, levels=None):
'''
Consensus
---------
Consensus is a measure of agreement for ordinal data, and thereby an inverse indication of dispersion. The value is 0 if there is no agreement and 1 if there is full agreement.
This function is shown in this [YouTube video](https://youtu.be/mmTfL5B0p7A), and the measure is also described at [PeterStatistics.com](https://peterstatistics.com/Terms/Measures/Consensus.html).
Parameters
----------
data : list or pandas series
levels : dictionary, optional
with coding to use
Returns
-------
Cns : the consensus score
Notes
-----
The formula used (Tastle et al., 2005, p. 98):
$$\\text{Cns}\\left(X\\right) = 1 + \\sum_{i=1}^k p_i \\log_2\\left(1 - \\frac{\\left|i - \\mu_X\\right|}{d_X}\\right)$$
With:
$$\\mu_X = \\frac{\\sum_{i=1}^k i\\times F_i}{n}$$
$$d_X = k - 1$$
$$p_i = \\frac{F_i}{n}$$
*Symbols used:*
* $F_i$ the frequency (count) of the i-th category (after they have been sorted)
* $n$ the sample size
* $k$ the number of categories.
Before, After and Alternatives
------------------------------
Before this measure you might want an impression using a frequency table or a visualisation:
* [tab_frequency](../other/table_frequency.html#tab_frequency) for a frequency table
* [vi_bar_stacked_single](../visualisations/vis_bar_stacked_single.html#vi_bar_stacked_single) for Single Stacked Bar-Chart
* [vi_bar_dual_axis](../visualisations/vis_bar_dual_axis.html#vi_bar_dual_axis) for Dual-Axis Bar Chart
After this you might want some other descriptive measures:
* [me_hodges_lehmann_os](../measures/meas_hodges_lehmann_os.html#me_hodges_lehmann_os) for the Hodges-Lehmann Estimate (One-Sample)
* [me_median](../measures/meas_median.html#me_median) for the Median
* [me_quantiles](../measures/meas_quantiles.html#me_quantiles) for Quantiles
* [me_quartiles](../measures/meas_quartiles.html#me_quartiles) for Quartiles / Hinges
* [me_quartile_range](../measures/meas_quartile_range.html#me_quartile_range) for Interquartile Range, Semi-Interquartile Range and Mid-Quartile Range
or perform a test:
* [ts_sign_os](../tests/test_sign_os.html#ts_sign_os) for One-Sample Sign Test
* [ts_trinomial_os](../tests/test_trinomial_os.html#ts_trinomial_os) for One-Sample Trinomial Test
* [ts_wilcoxon_os](../tests/test_wilcoxon_os.html#ts_wilcoxon_os) for Wilcoxon Signed Rank Test (One-Sample)
References
----------
Tastle, W. J., & Wierman, M. J. (2007). Consensus and dissention: A measure of ordinal dispersion. *International Journal of Approximate Reasoning, 45*(3), 531–545. doi:10.1016/j.ijar.2006.06.024
Author
------
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
--------
Example 1: Text Pandas Series
>>> import pandas as pd
>>> student_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = student_df['Teach_Motivate']
>>> order = {"Fully Disagree":1, "Disagree":2, "Neither disagree nor agree":3, "Agree":4, "Fully agree":5}
>>> me_consensus(ex1, levels=order)
np.float64(0.42896860013343563)
Example 2: Numeric data
>>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
>>> me_consensus(ex2)
np.float64(0.3340394927779964)
Example 3: Text data
>>> ex3 = ["a", "b", "f", "d", "e", "c"]
>>> order = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}
>>> me_consensus(ex3, levels=order)
np.float64(0.4444745779083972)
'''
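    # accept plain lists by converting them to a pandas Series
    if type(data) is list:
        data = pd.Series(data)

    # remove missing values
    data = data.dropna()

    # recode labels to numbers if a coding is given, else read the data as numeric
    if levels is not None:
        pd.set_option('future.no_silent_downcasting', True)
        dataN = data.map(levels).astype('Int8')
    else:
        dataN = pd.to_numeric(data)

    dataN = dataN.sort_values()

    # frequencies of the observed categories, in ascending order
    F = list(dataN.value_counts().sort_index())
    k = len(F)
    n = sum(F)
    P = [i/n for i in F]

    # mean rank, scale width, and the consensus itself
    mu_x = sum([i*F[i-1] for i in range(1, k + 1)]) / n
    d_x = k - 1
    Cns = 1 + sum([P[i]*np.log2(1 - abs(i + 1 - mu_x)/d_x) for i in range(k)])

    return Cns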
Functions
def me_consensus(data, levels=None)
Consensus
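Consensus is a measure of agreement for ordinal data, and thereby an inverse indication of dispersion. The value is 0 if there is no agreement and 1 if there is full agreement.
This function is shown in this YouTube video (https://youtu.be/mmTfL5B0p7A), and the measure is also described at PeterStatistics.com (https://peterstatistics.com/Terms/Measures/Consensus.html).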
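The two extremes can be checked with a small sketch. The helper `consensus_from_proportions` below is hypothetical and not part of the package; it assumes the proportions are given for every category on the scale (ranks 1 to k) and skips empty categories so that no `0 * log2(0)` term arises.

```python
import numpy as np

def consensus_from_proportions(p):
    """Hypothetical helper: consensus for proportions p over ranks 1..k."""
    p = np.asarray(p, dtype=float)
    k = len(p)
    ranks = np.arange(1, k + 1)
    mu = np.sum(ranks * p)   # mean rank
    d = k - 1                # width of the scale
    keep = p > 0             # skip empty categories
    return 1 + np.sum(p[keep] * np.log2(1 - np.abs(ranks[keep] - mu) / d))

# everyone picks the middle category of a five-point scale
full_agreement = consensus_from_proportions([0, 0, 1, 0, 0])

# half picks the lowest category, half the highest
no_agreement = consensus_from_proportions([0.5, 0, 0, 0, 0.5])

print(full_agreement, no_agreement)
```

Under these assumptions the first case should give 1 and the second 0, matching the description above.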
Parameters
data : list or pandas series
levels : dictionary, optional
with coding to use
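As an illustration of what such a coding does, the sketch below maps text labels to numbers with `Series.map`, similar to the recoding step in the source code. The labels and the `coding` dictionary are made up for this example.

```python
import pandas as pd

# made-up responses, including one missing value
responses = pd.Series(["Agree", "Disagree", "Agree", None, "Fully agree"])

# hypothetical coding: label -> numeric code
coding = {"Disagree": 2, "Agree": 4, "Fully agree": 5}

# drop missing values first, then replace each label by its code
coded = responses.dropna().map(coding)
print(coded.tolist())
```

Labels that do not appear as keys in the dictionary are mapped to a missing value by `Series.map`, so the coding should probably cover every label that occurs in the data.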
Returns
Cns : the consensus score
Notes
The formula used (Tastle et al., 2005, p. 98):
$$\text{Cns}\left(X\right) = 1 + \sum_{i=1}^k p_i \log_2\left(1 - \frac{\left|i - \mu_X\right|}{d_X}\right)$$
With:
$$\mu_X = \frac{\sum_{i=1}^k i\times F_i}{n}$$
$$d_X = k - 1$$
$$p_i = \frac{F_i}{n}$$
Symbols used:
- $F_i$ the frequency (count) of the i-th category (after they have been sorted)
- $n$ the sample size
- $k$ the number of categories.
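The formula can be traced step by step on the numeric data of Example 2. This is only a sketch using NumPy directly, not the package's own code; the variable names follow the symbols above.

```python
import numpy as np

scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

# F_i: frequency of each observed category, in sorted order
values, F = np.unique(scores, return_counts=True)

k = len(F)                    # number of categories
n = F.sum()                   # sample size
p = F / n                     # proportions p_i
ranks = np.arange(1, k + 1)   # i = 1, ..., k

mu_x = np.sum(ranks * F) / n  # mean rank
d_x = k - 1                   # width of the scale

cns = 1 + np.sum(p * np.log2(1 - np.abs(ranks - mu_x) / d_x))

print(F.tolist(), n, mu_x, d_x)
print(cns)
```

The result should agree with the value that `me_consensus` reports for the same data in Example 2 (about 0.334).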
Before, After and Alternatives
Before this measure you might want an impression using a frequency table or a visualisation:
- tab_frequency for a frequency table
- vi_bar_stacked_single for Single Stacked Bar-Chart
- vi_bar_dual_axis for Dual-Axis Bar Chart
After this you might want some other descriptive measures:
- me_hodges_lehmann_os for the Hodges-Lehmann Estimate (One-Sample)
- me_median for the Median
- me_quantiles for Quantiles
- me_quartiles for Quartiles / Hinges
- me_quartile_range for Interquartile Range, Semi-Interquartile Range and Mid-Quartile Range
or perform a test:
- ts_sign_os for One-Sample Sign Test
- ts_trinomial_os for One-Sample Trinomial Test
- ts_wilcoxon_os for Wilcoxon Signed Rank Test (One-Sample)
References
Tastle, W. J., & Wierman, M. J. (2007). Consensus and dissention: A measure of ordinal dispersion. International Journal of Approximate Reasoning, 45(3), 531–545. doi:10.1016/j.ijar.2006.06.024
Author
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
Example 1: Text Pandas Series
>>> import pandas as pd
>>> student_df = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = student_df['Teach_Motivate']
>>> order = {"Fully Disagree":1, "Disagree":2, "Neither disagree nor agree":3, "Agree":4, "Fully agree":5}
>>> me_consensus(ex1, levels=order)
np.float64(0.42896860013343563)
Example 2: Numeric data
>>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
>>> me_consensus(ex2)
np.float64(0.3340394927779964)
Example 3: Text data
>>> ex3 = ["a", "b", "f", "d", "e", "c"]
>>> order = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}
>>> me_consensus(ex3, levels=order)
np.float64(0.4444745779083972)