Module stikpetP.visualisations.vis_boxplot_single

import pandas as pd
import matplotlib.pyplot as plt


def vi_boxplot_single(data, varname=None):
    '''
    Box (and Whisker) Plot
    ----------------------
    
    A box plot is a slightly more complex visualisation than a histogram. It shows the five-number summary: the minimum, 1st quartile, median, 3rd quartile, and maximum. It can also be adjusted to show so-called outliers.

    This function is shown in this [YouTube video](https://youtu.be/a0Vu6kL_WYo) and the visualisation is described at [PeterStatistics.com](https://peterstatistics.com/Terms/Visualisations/boxPlot.html).
    
    Parameters
    ----------
    data : list or pandas series 
        the numeric data
    varname : string, optional
        name to display on vertical axis
    
    Notes
    -----
    The chart was originally called a 'range chart' (Spear, 1952, p. 166), but these days it is usually referred to as a box-and-whisker plot, the name used by Tukey (1977, p. 39).
    
    The function uses the **boxplot()** function from the *matplotlib.pyplot* library. If you want to modify more things (like colors etc.) you might want to use that function directly.

    Before, After and Alternatives
    ------------------------------
    Before this you might want to create a binned frequency table with [tab_frequency_bins](../other/table_frequency_bins.html#tab_frequency_bins).

    After this you might want some descriptive measures. Use [me_mode_bin](../measures/meas_mode_bin.html#me_mode_bin) for Mode for Binned Data, [me_mean](../measures/meas_mean.html#me_mean) for different types of mean, and/or [me_variation](../measures/meas_variation.html#me_variation) for different Measures of Quantitative Variation.
    
    Or perform a test. Various options include [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test, [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test.

    An alternative visualisation is [vi_histogram](../visualisations/vis_histogram.html#vi_histogram) for a Histogram.
    
    References
    ----------
    Spear, M. E. (1952). *Charting statistics*. McGraw-Hill.
    
    Tukey, J. W. (1977). *Exploratory data analysis*. Addison-Wesley Pub. Co.
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076
    
    Examples
    --------
    Example 1: pandas series
    >>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = df2['Gen_Age']
    >>> vi_boxplot_single(ex1);
    
    Example 2: Numeric list
    >>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
    >>> vi_boxplot_single(ex2);
    
    '''
    
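    # Convert a plain list to a pandas Series so the same clean-up applies to both input types
    if isinstance(data, list):
        data = pd.Series(data)
        
    # Remove missing values, then make sure the remaining values are numeric
    data = data.dropna()
    data = pd.to_numeric(data)
    
    # Note: the 'orientation' argument needs a recent Matplotlib release
    plt.boxplot(data, orientation='horizontal')
    plt.ylabel(varname)
    plt.yticks([1], [" "])
    plt.show()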

Functions

def vi_boxplot_single(data, varname=None)

Box (and Whisker) Plot

A box plot is a slightly more complex visualisation than a histogram. It shows the five-number summary: the minimum, 1st quartile, median, 3rd quartile, and maximum. It can also be adjusted to show so-called outliers.

This function is shown in this YouTube video and the visualisation is described at PeterStatistics.com.
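The five values a box plot displays (minimum, 1st quartile, median, 3rd quartile, and maximum) can also be computed directly. The following is a minimal sketch, not part of this package: the helper name `five_number_summary` is hypothetical, and it assumes the default linear-interpolation quantiles of pandas. Other quartile definitions exist and can give slightly different values.

```python
import pandas as pd


def five_number_summary(values):
    """Return the minimum, quartiles, median and maximum of numeric values.

    Hypothetical helper for illustration only; missing values are dropped first.
    """
    s = pd.to_numeric(pd.Series(values).dropna())
    return {
        "min": s.min(),
        "q1": s.quantile(0.25),   # pandas default: linear interpolation
        "median": s.median(),
        "q3": s.quantile(0.75),
        "max": s.max(),
    }


# Same numeric list as in Example 2 of this page
ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
summary = five_number_summary(ex2)
print(summary)
```

For the list above, the box would therefore span from 2 to 5 with the median line at 4.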

Parameters

data : list or pandas series
the numeric data
varname : string, optional
name to display on vertical axis
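As an illustration of what "numeric data" means here, the sketch below mirrors the clean-up steps in the function's source: a list is converted to a pandas Series, missing values are dropped, and the rest is converted to numbers. The helper name `prepare_numeric` is hypothetical and only serves this example.

```python
import pandas as pd


def prepare_numeric(data):
    """Mimic the input clean-up: list -> Series, drop missing values, convert to numeric."""
    if isinstance(data, list):
        data = pd.Series(data)
    # pd.to_numeric raises an error if a remaining value cannot be parsed as a number
    return pd.to_numeric(data.dropna())


# A list with a missing value and a number stored as text
cleaned = prepare_numeric([3, None, "4", 5.5])
print(cleaned.tolist())
```

Values that cannot be parsed as numbers (for example a text label) would raise an error at the conversion step, so such entries should be removed or recoded beforehand.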

Notes

The chart was originally called a 'range chart' (Spear, 1952, p. 166), but these days it is usually referred to as a box-and-whisker plot, the name used by Tukey (1977, p. 39).

The function uses the boxplot() function from the matplotlib.pyplot library. If you want to modify more things (like colors etc.) you might want to use that function directly.
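As a rough sketch of the kind of customisation mentioned above, Matplotlib's boxplot can be called directly with styling arguments. The function name `styled_boxplot` and the chosen colours are assumptions for illustration and are not part of this package. The sketch leaves the box in Matplotlib's default orientation and selects the non-interactive `Agg` backend so it also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def styled_boxplot(data, varname=None):
    """Draw a single box plot with a filled box (hypothetical helper)."""
    values = pd.to_numeric(pd.Series(data).dropna()).to_numpy()
    fig, ax = plt.subplots()
    parts = ax.boxplot(
        values,
        patch_artist=True,                    # draw the box as a fillable patch
        boxprops={"facecolor": "lightblue"},  # fill colour of the box
        medianprops={"color": "black"},       # colour of the median line
        whis=1.5,                             # whiskers reach up to 1.5 * IQR from the box
        showfliers=True,                      # points beyond the whiskers are drawn separately
    )
    if varname is not None:
        ax.set_ylabel(varname)
    ax.set_xticks([])                         # hide the tick for the single box
    return fig, parts


ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
fig, parts = styled_boxplot(ex2, varname="score")
fig.savefig("boxplot_example.png")
```

The returned dictionary holds the drawn artists (boxes, medians, whiskers, and so on), which can be adjusted further after plotting.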

Before, After and Alternatives

Before this you might want to create a binned frequency table with tab_frequency_bins.

After this you might want some descriptive measures. Use me_mode_bin for Mode for Binned Data, me_mean for different types of mean, and/or me_variation for different Measures of Quantitative Variation.

Or perform a test. Various options include ts_student_t_os for One-Sample Student t-Test, ts_trimmed_mean_os for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or ts_z_os for One-Sample Z Test.

An alternative visualisation is vi_histogram for a Histogram.

References

Spear, M. E. (1952). Charting statistics. McGraw-Hill.

Tukey, J. W. (1977). Exploratory data analysis. Addison-Wesley Pub. Co.

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076

Examples

Example 1: pandas series

>>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = df2['Gen_Age']
>>> vi_boxplot_single(ex1);

Example 2: Numeric list

>>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
>>> vi_boxplot_single(ex2);