Module stikpetP.visualisations.vis_histogram

import pandas as pd
import matplotlib.pyplot as plt

def vi_histogram(data, xlbl=None, ylbl=None, **kwargs):
    '''
    Histogram
    ---------
    
    A histogram is a bit like a bar chart for a scale variable. The scores are grouped into bins, and each bin is then plotted as a bar.

    This function is shown in this [YouTube video](https://youtu.be/P3UbUQ4deRI) and the visualisation is described at [PeterStatistics.com](https://peterstatistics.com/Terms/Visualisations/histogram.html).
    
    Parameters
    ----------
    data : list or pandas series 
        the numeric scores
    xlbl : string, optional 
        label for the horizontal axis
    ylbl : string, optional 
        label for the vertical axis
    kwargs : other parameters for use in pyplot hist function
    
    Notes
    -----
    To set the bins, the *bins* argument can be used. This could be a pre-set number based on a calculation, a specific rule (e.g. bins="sturges"), or a list with the cut-off points.
    
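    If your bins are not of equal width, a true histogram should show frequency densities (Pearson, 1895, p. 399). These are the frequencies divided by the bin-width. The *density=True* parameter comes close: it divides each frequency by the bin-width and also by the total number of scores, so the bars keep the shape of the frequency densities but have a total area of 1.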

    Before, After and Alternatives
    ------------------------------
    Before this you might want to create a binned frequency table with [tab_frequency_bins](../other/table_frequency_bins.html#tab_frequency_bins).

    After this you might want some descriptive measures. Use [me_mode_bin](../measures/meas_mode_bin.html#me_mode_bin) for Mode for Binned Data, [me_mean](../measures/meas_mean.html#me_mean) for different types of mean, and/or [me_variation](../measures/meas_variation.html#me_variation) for different Measures of Quantitative Variation.
    
    Or perform a test. Various options include [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test, [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test.

    An alternative visualisation is [vi_boxplot_single](../visualisations/vis_boxplot_single.html#vi_boxplot_single) for a Box (and Whisker) Plot.
    
    References
    ----------
    Pearson, K. (1895). Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. *Philosophical Transactions of the Royal Society of London. (A.)*, 186, 343–414. doi:10.1098/rsta.1895.0010
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076
    
    Examples
    ---------    
    Example 1: pandas series
    >>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = df2['Gen_Age']
    >>> vi_histogram(ex1);
    >>> vi_histogram(ex1, density=True);
    
    Example 2: Numeric list
    >>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
    >>> vi_histogram(ex2);
    
    '''
    
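    # draw the histogram; extra keyword arguments (e.g. bins, density) are passed on to pyplot's hist
    plt.hist(data, **kwargs)
    # label the horizontal and vertical axes, then display the figure
    plt.xlabel(xlbl)
    plt.ylabel(ylbl)
    plt.show()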
    
    return

Functions

def vi_histogram(data, xlbl=None, ylbl=None, **kwargs)

Histogram

A histogram is a bit like a bar chart for a scale variable. The scores are grouped into bins, and each bin is then plotted as a bar.

This function is shown in this YouTube video and the visualisation is described at PeterStatistics.com.

Parameters

data : list or pandas series
the numeric scores
xlbl : string, optional
label for the horizontal axis
ylbl : string, optional
label for the vertical axis
kwargs : other parameters for use in pyplot hist function
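
The way the three kinds of argument work together can be sketched as follows. This is only an illustration: `histogram_sketch` is a hypothetical stand-in (not part of the package) that mirrors vi_histogram but returns the Axes object instead of calling `plt.show()`, the ages are made-up values, and the Agg backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; leave this line out in a notebook
import matplotlib.pyplot as plt


def histogram_sketch(data, xlbl=None, ylbl=None, **kwargs):
    """Hypothetical stand-in for vi_histogram that returns the Axes instead of showing the plot."""
    fig, ax = plt.subplots()
    ax.hist(data, **kwargs)  # extra keyword arguments are forwarded unchanged to hist
    if xlbl is not None:
        ax.set_xlabel(xlbl)
    if ylbl is not None:
        ax.set_ylabel(ylbl)
    return ax


ages = [18, 19, 19, 20, 21, 21, 21, 22, 24, 30]  # made-up values for illustration
ax = histogram_sketch(ages, xlbl="age", ylbl="frequency", bins=4, edgecolor="black")
```

Here `bins` and `edgecolor` are not parameters of the function itself; they end up in kwargs and are handed to the hist call.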
 

Notes

To set the bins, the bins argument can be used. This could be a pre-set number based on a calculation, a specific rule (e.g. bins="sturges"), or a list with the cut-off points.
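
The three ways of setting the bins could look like the sketch below. It calls pyplot's hist directly, since vi_histogram passes the bins argument on to it; the particular bin count and cut-off points are arbitrary choices for illustration, and the Agg backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; leave this line out in a notebook
import matplotlib.pyplot as plt

scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

# 1) a pre-set number of equal-width bins
counts_n, edges_n, _ = plt.hist(scores, bins=4)
plt.close()

# 2) a named rule; the number of bins is then chosen by numpy
counts_rule, edges_rule, _ = plt.hist(scores, bins="sturges")
plt.close()

# 3) a list with the cut-off points (these bins are not all equally wide)
counts_cut, edges_cut, _ = plt.hist(scores, bins=[0.5, 2.5, 4.5, 5.5])
plt.close()

print(counts_n, edges_n)
print(counts_cut, edges_cut)
```

With vi_histogram the same values would be passed as, for example, `vi_histogram(scores, bins=[0.5, 2.5, 4.5, 5.5])`.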

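If your bins are not of equal width, a true histogram should show frequency densities (Pearson, 1895, p. 399). These are the frequencies divided by the bin-width. The density=True parameter comes close: it divides each frequency by the bin-width and also by the total number of scores, so the bars keep the shape of the frequency densities but have a total area of 1.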
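
How frequency densities relate to what density=True returns can be checked with a small calculation. The sketch below uses numpy's histogram function, which pyplot's hist relies on for the binning; the cut-off points are an arbitrary illustration with unequal widths.

```python
import numpy as np

scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
edges = [0.5, 2.5, 4.5, 5.5]  # illustrative bins of unequal width

counts, _ = np.histogram(scores, bins=edges)
widths = np.diff(edges)  # bin-widths: 2, 2 and 1

# frequency density: frequency divided by bin-width
freq_density = counts / widths

# density=True additionally divides by the number of scores
prob_density, _ = np.histogram(scores, bins=edges, density=True)

print(freq_density)
print(prob_density * len(scores))  # same values as freq_density
```

To draw the frequency densities themselves rather than the normalised version, one option is `plt.bar(edges[:-1], freq_density, width=widths, align="edge")`.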

Before, After and Alternatives

Before this you might want to create a binned frequency table with tab_frequency_bins.

After this you might want some descriptive measures. Use me_mode_bin for Mode for Binned Data, me_mean for different types of mean, and/or me_variation for different Measures of Quantitative Variation.

Or perform a test. Various options include ts_student_t_os for One-Sample Student t-Test, ts_trimmed_mean_os for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or ts_z_os for One-Sample Z Test.

An alternative visualisation is vi_boxplot_single for a Box (and Whisker) Plot.

References

Pearson, K. (1895). Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. Philosophical Transactions of the Royal Society of London. (A.), 186, 343–414. doi:10.1098/rsta.1895.0010

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076

Examples

Example 1: pandas series

>>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = df2['Gen_Age']
>>> vi_histogram(ex1);
>>> vi_histogram(ex1, density=True);

Example 2: Numeric list

>>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
>>> vi_histogram(ex2);