Module stikpetP.visualisations.vis_histogram
import pandas as pd
import matplotlib.pyplot as plt
def vi_histogram(data, xlbl=None, ylbl=None, **kwargs):
    '''
    Histogram
    ---------
    A histogram is a bit like a bar chart for a scale variable. You would create some bins, and then plot these as bars.

    This function is shown in this [YouTube video](https://youtu.be/P3UbUQ4deRI) and the visualisation is described at [PeterStatistics.com](https://peterstatistics.com/Terms/Visualisations/histogram.html).

    Parameters
    ----------
    data : list or pandas series
        the numeric scores
    xlbl : string, optional
        label for the horizontal axis
    ylbl : string, optional
        label for the vertical axis
    kwargs : other parameters for use in pyplot hist function

    Notes
    -----
    To set the bins, the *bins* argument can be used. This could be a pre-set number based on a calculation, a specific rule (e.g. bins="sturges"), or a list with the cut-off points.

    If your bins are of unequal width, a true histogram should then actually show frequency densities (Pearson, 1895, p. 399). These are the frequencies divided by the bin width. Passing *density=True* gives a normalised version of these: the frequency densities are also divided by the total count, so that the bar areas sum to 1.

    Before, After and Alternatives
    ------------------------------
    Before this you might want to create a binned frequency table with [tab_frequency_bins](../other/table_frequency_bins.html#tab_frequency_bins).

    After this you might want some descriptive measures. Use [me_mode_bin](../measures/meas_mode_bin.html#me_mode_bin) for Mode for Binned Data, [me_mean](../measures/meas_mean.html#me_mean) for different types of mean, and/or [me_variation](../measures/meas_variation.html#me_variation) for different Measures of Quantitative Variation.

    Or perform a test. Various options include [ts_student_t_os](../tests/test_student_t_os.html#ts_student_t_os) for One-Sample Student t-Test, [ts_trimmed_mean_os](../tests/test_trimmed_mean_os.html#ts_trimmed_mean_os) for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or [ts_z_os](../tests/test_z_os.html#ts_z_os) for One-Sample Z Test.

    Alternative Visualisations are [vi_boxplot_single](../visualisations/vis_boxplot_single.html#vi_boxplot_single) for a Box (and Whisker) Plot and [vi_histogram](../visualisations/vis_histogram.html#vi_histogram) for a Histogram

    References
    ----------
    Pearson, K. (1895). Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. *Philosophical Transactions of the Royal Society of London. (A.)*, 186, 343–414. doi:10.1098/rsta.1895.0010

    Author
    ------
    Made by P. Stikker

    Companion website: https://PeterStatistics.com
    YouTube channel: https://www.youtube.com/stikpet
    Donations: https://www.patreon.com/bePatron?u=19398076

    Examples
    --------
    Example 1: pandas series
    >>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
    >>> ex1 = df2['Gen_Age']
    >>> vi_histogram(ex1);
    >>> vi_histogram(ex1, density=True);

    Example 2: Numeric list
    >>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
    >>> vi_histogram(ex2);

    '''
    # Draw the histogram; any extra keyword arguments go straight to pyplot's hist
    plt.hist(data, **kwargs)
    plt.xlabel(xlbl)
    plt.ylabel(ylbl)
    plt.show()
    return
Functions
def vi_histogram(data, xlbl=None, ylbl=None, **kwargs)
Histogram
A histogram is a bit like a bar chart for a scale variable. You would create some bins, and then plot these as bars.
This function is shown in this YouTube video and the visualisation is described at PeterStatistics.com
Parameters
data : list or pandas series
    the numeric scores
xlbl : string, optional
    label for the horizontal axis
ylbl : string, optional
    label for the vertical axis
kwargs : other parameters for use in pyplot hist function
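As a rough illustration of how these parameters fit together, the sketch below uses a local stand-in, `vi_histogram_sketch` (a hypothetical name, not part of the package). It mirrors the pyplot calls in the source code of this function, but returns the counts and bin edges instead of calling `plt.show()`. The `Agg` backend, the axis labels and the `edgecolor` setting are assumptions chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt


def vi_histogram_sketch(data, xlbl=None, ylbl=None, **kwargs):
    """Hypothetical stand-in mirroring vi_histogram; returns instead of showing."""
    counts, edges, _ = plt.hist(data, **kwargs)  # extra keywords go to pyplot hist
    plt.xlabel(xlbl)
    plt.ylabel(ylbl)
    return counts, edges


scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
counts, edges = vi_histogram_sketch(
    scores,
    xlbl="Score",
    ylbl="Frequency",
    bins=[0.5, 1.5, 2.5, 3.5, 4.5, 5.5],  # one bin per score
    edgecolor="black",                    # passed through to the bars
)

# Read the labels back from the current axes to confirm they were set.
xlabel_text = plt.gca().get_xlabel()
ylabel_text = plt.gca().get_ylabel()
```

With the function documented here, the equivalent call would presumably be `vi_histogram(scores, xlbl="Score", ylbl="Frequency", bins=[0.5, 1.5, 2.5, 3.5, 4.5, 5.5], edgecolor="black")`.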
Notes
To set the bins, the bins argument can be used. This could be a pre-set number based on a calculation, a specific rule (e.g. bins="sturges"), or a list with the cut-off points.
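The three ways of setting the bins can be sketched without drawing anything. The example below calls `numpy.histogram` directly, which is what pyplot's hist function uses for the binning, so the same `bins` values could be passed to `vi_histogram`. The data are those of Example 2; the specific bin count and cut-off points are assumptions picked for illustration.

```python
import numpy as np

scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

# 1. A pre-set number of equal-width bins between the minimum and maximum.
counts_n, edges_n = np.histogram(scores, bins=4)
# edges_n  -> 1, 2, 3, 4, 5
# counts_n -> 3, 3, 2, 10 (the last bin includes its right edge)

# 2. A named rule; the number of bins is then derived from the data.
counts_rule, edges_rule = np.histogram(scores, bins="sturges")

# 3. A list of cut-off points, which may be unequally spaced.
counts_cut, edges_cut = np.histogram(scores, bins=[0.5, 2.5, 4.5, 5.5])
# counts_cut -> 6, 5, 7
```

Whichever form is used, every score inside the bin range lands in exactly one bin, so the counts add up to the number of scores.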
If your bins are of unequal width, a true histogram should then actually show frequency densities (Pearson, 1895, p. 399). These are the frequencies divided by the bin width. Passing density=True gives a normalised version of these: the frequency densities are also divided by the total count, so that the bar areas sum to 1.
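A small worked example may help to separate the two quantities. It is only a sketch: the cut-off points are an assumption chosen to give bins of width 2, 2 and 1, and `numpy.histogram` is used in place of the plotting call so the numbers can be inspected.

```python
import numpy as np

scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
cut_offs = [0.5, 2.5, 4.5, 5.5]  # assumption: bins of width 2, 2 and 1

counts, edges = np.histogram(scores, bins=cut_offs)
widths = np.diff(edges)

# Frequency density: frequency divided by bin width.
freq_density = counts / widths  # 6/2, 5/2, 7/1 -> 3.0, 2.5, 7.0

# density=True also divides by the total count, so the bar areas sum to 1.
norm_density, _ = np.histogram(scores, bins=cut_offs, density=True)
total_area = (norm_density * widths).sum()
```

If the unnormalised frequency densities are wanted on the vertical axis, one option is to draw them directly, for example with `plt.bar(edges[:-1], freq_density, width=widths, align="edge")`.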
Before, After and Alternatives
Before this you might want to create a binned frequency table with tab_frequency_bins.
After this you might want some descriptive measures. Use me_mode_bin for Mode for Binned Data, me_mean for different types of mean, and/or me_variation for different Measures of Quantitative Variation.
Or perform a test. Various options include ts_student_t_os for One-Sample Student t-Test, ts_trimmed_mean_os for One-Sample Trimmed (Yuen or Yuen-Welch) Mean Test, or ts_z_os for One-Sample Z Test.
Alternative Visualisations are vi_boxplot_single for a Box (and Whisker) Plot and vi_histogram for a Histogram
References
Pearson, K. (1895). Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. Philosophical Transactions of the Royal Society of London. (A.), 186, 343–414. doi:10.1098/rsta.1895.0010
Author
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
Examples
Example 1: pandas series
>>> df2 = pd.read_csv('https://peterstatistics.com/Packages/ExampleData/StudentStatistics.csv', sep=';', low_memory=False, storage_options={'User-Agent': 'Mozilla/5.0'})
>>> ex1 = df2['Gen_Age']
>>> vi_histogram(ex1);
>>> vi_histogram(ex1, density=True);
Example 2: Numeric list
>>> ex2 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]
>>> vi_histogram(ex2);