Frequency Table

A frequency table is defined as "a table showing (1) all of the values for a variable in a dataset, and (2) the frequency of each of those responses. Some frequency tables also show a cumulative frequency and proportions of responses" (Warne, 2017, p. 512). An example is shown in Table 1.

Table 1
*Results of How scientific-Accounting*
		Frequency	Percent	Valid Percent	Cumulative Percent
Valid	very scientific	100	5.1	10.5	10.5
	pretty scientific	199	10.1	20.9	31.3
	not too scientific	348	17.6	36.5	67.8
	not scientific at all	307	15.6	32.2	100
	Subtotal	954	48.3	100.0
Missing	No answer	1020	51.7
	Subtotal	1020	51.7
Total		1974	100.0

Click here to see how to create a frequency table with Excel, Python, R, or SPSS.

with Excel

Excel file from videos available here.

with stikpetE add-in

without add-ins

with Python

Jupyter Notebook of video is available here.

with stikpetP library

without stikpetP library

with R (Studio)

with stikpetR library

Jupyter Notebook of video is available here.

without stikpetR library

R script of video is available here.

Datafile used in video: GSS2012-Adjusted.sav

with SPSS

There are three different ways to create a frequency table with SPSS.

An SPSS workbook with instructions for the first two can be found here.

using Frequencies

Watch the video below, or download the pdf instructions (via bitly, opens in new window/tab).

Datafile used in video: Holiday Fair.sav

using Custom Tables

Watch the video below, or download the pdf instructions for versions before 24, or version 24 (via bitly, opens in new window/tab).

Datafile used in video: Holiday Fair.sav

using descriptive shortcut

Watch the video below, or download the pdf instructions (via bitly, opens in new window/tab).

Datafile used in video: StudentStatistics.sav

A frequency table can help you get an impression of survey data for a binary, nominal, or ordinal variable. It can also help with a scale variable, provided there are not too many different values. If, for example, you asked for age, a list running from 1 to 90 with each age and its frequency will probably not be very helpful.

If the scale variable has many different values, the data is often binned (e.g. 0 < 10, 10 < 20, etc.). This creates an ordinal variable, for which a frequency table can again be helpful. See binning for more information on this.

A frequency table can show different types of frequencies. Various options are discussed in the different sections below (click on each to reveal its content).
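The binning described above can be sketched in a few lines of Python. This is a minimal illustration only: the ages are made up, and the names `BIN_WIDTH` and `bin_label` are hypothetical helpers, not part of any library mentioned on this page.

```python
from collections import Counter

# Hypothetical ages, made up purely for illustration.
ages = [3, 8, 12, 15, 19, 21, 27, 34, 35, 38, 41, 59, 60, 67, 88]

BIN_WIDTH = 10  # assumed equal bin width


def bin_label(age, width=BIN_WIDTH):
    """Return the label of the bin 'lower < upper' that contains age."""
    lower = (age // width) * width
    return f"{lower} < {lower + width}"


# Frequency table of the binned (now ordinal) variable.
binned = Counter(bin_label(age) for age in ages)

for label, count in binned.items():
    print(f"{label}: {count}")
```

Each bin includes its lower bound and excludes its upper bound, so an age of exactly 10 lands in '10 < 20'.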

(Absolute) Frequency

The column Frequency shows how many respondents answered each option. We can tell that 100 people in this survey chose the option 'very scientific'. This is also known as the absolute frequency and defined as “the number of occurrences of a particular phenomenon” (Zedeck, “Frequency”, 2014, p. 144).
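As a rough sketch of how absolute frequencies are obtained, the snippet below rebuilds the responses from the counts in Table 1 and counts them with Python's standard library. Using `None` to mark a missing answer is an assumption made for this illustration.

```python
from collections import Counter

# Responses rebuilt from the counts in Table 1; None marks 'No answer'.
responses = (
    ["very scientific"] * 100
    + ["pretty scientific"] * 199
    + ["not too scientific"] * 348
    + ["not scientific at all"] * 307
    + [None] * 1020
)

# Absolute frequency: the number of occurrences of each value.
frequencies = Counter(responses)

for option, count in frequencies.items():
    label = "No answer" if option is None else option
    print(f"{label}: {count}")
```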

(Valid) Percent and Relative

The Percent column shows the percentages, based on the grand total, so including the missing values. The 5.1 indicates that 5.1% of all respondents chose the 'very scientific' option (you can check that 100 / 1974 x 100 ≈ 5.1).
Percentages can be defined as “a way of expressing ratios in terms of whole numbers. A ratio or fraction is converted to a percentage by multiplying by 100 and appending a "percentage sign" %” (Weisstein, 2002, p. 2200).

The Valid Percent shows the percentage based on the valid total, so excluding the missing values. The 10.5 indicates that 10.5% of those who answered this question chose the 'very scientific' option. Most often the ‘Percent’ shown in reports is actually the Valid Percent, but the word ‘Valid’ is then simply left out.

Percentages show the number of cases that could be expected if there were 100 cases in total, hence per-cent, which means 'per 100'. If your sample size is very small, be careful about using percentages. If it is less than 100, you are 'blowing up' your differences, whereas percentages are more commonly used to 'scale down'.

APA recommends reporting percentages with one or no decimals.

The term relative frequency is also sometimes used. This is the frequency divided by the total number of cases. Note that this should then always produce a decimal value between 0 and 1 (inclusive). Multiply this by 100 and you get the percentage, multiply it by 1000 and you get permille (‰), multiply it by 360 and you get the degrees of a circle, etc.
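The three measures above can be reproduced from the frequencies in Table 1 with the minimal sketch below. The relative frequency is computed against the valid total here; that is a choice made for this illustration, and the grand total could be used instead.

```python
# Frequencies taken from Table 1.
valid_counts = {
    "very scientific": 100,
    "pretty scientific": 199,
    "not too scientific": 348,
    "not scientific at all": 307,
}
missing_count = 1020

valid_total = sum(valid_counts.values())   # excludes missing values
grand_total = valid_total + missing_count  # includes missing values

table = {}
for option, count in valid_counts.items():
    relative = count / valid_total  # relative frequency, between 0 and 1
    table[option] = {
        "percent": round(count / grand_total * 100, 1),
        "valid_percent": round(relative * 100, 1),
        "relative": relative,
    }

for option, row in table.items():
    print(option, row["percent"], row["valid_percent"])
```

The rounded results should match the Percent and Valid Percent columns of Table 1.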

Cumulative (Percent)

The cumulative frequency (not shown in the example table) can be defined as: “the total (absolute) frequency up to the upper boundary of that class” (Kenney, 1939, p. 16). This is only useful if there is an order to the categories, so that we can say, for example, that 299 respondents found accounting pretty scientific or more than that. This is why cumulative frequencies have no meaningful interpretation for a nominal variable (e.g. 28 students study business or less?).

The Cumulative Percent is the running total of the Valid Percent: the sum of the valid percentages of the current category and all previous ones. We can see that 31.3% of the respondents who answered this question thought accounting is pretty or very scientific.
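A short sketch of the running totals, using the ordered categories and frequencies from Table 1. The cumulative percent is computed here from the unrounded cumulative frequency, which is one way of doing it and not necessarily how any particular package works internally.

```python
from itertools import accumulate

# Ordered categories and frequencies from Table 1 (valid answers only).
options = [
    "very scientific",
    "pretty scientific",
    "not too scientific",
    "not scientific at all",
]
frequencies = [100, 199, 348, 307]

valid_total = sum(frequencies)

# Running total of the absolute frequencies.
cumulative_frequencies = list(accumulate(frequencies))

# Cumulative percent, based on the unrounded running total.
cumulative_percents = [
    round(cf / valid_total * 100, 1) for cf in cumulative_frequencies
]

for option, cf, cp in zip(options, cumulative_frequencies, cumulative_percents):
    print(f"{option}: {cf} ({cp}%)")
```

Adding the already rounded valid percentages instead (10.5 + 20.9 = 31.4) would not reproduce the 31.3 shown in Table 1, which is why the sketch works from the counts.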

Density

When the categories are ranges of values (bins), the frequency density could become helpful. It can be defined as: “the number of occurrences of an event divided by the bin size…” (Zedeck, 2014, pp. 144–145).

In principle it is the frequency divided by the bin size (the upper bound minus the lower bound). It shows how 'dense' that particular category (bin) is. Table 2 shows an example.

Table 2
*Example calculation of frequency density*
Age	Frequency	bin size	Frequency Density
0 < 10	15	10 – 0 = 10	15 / 10 = 1.5
10 < 15	23	15 – 10 = 5	23 / 5 = 4.6
15 < 25	22	25 – 15 = 10	22 / 10 = 2.2
25 < 50	40	50 – 25 = 25	40 / 25 = 1.6
50 < 100	4	100 – 50 = 50	4 / 50 = 0.08

Note that if all the bins are the same size, there is not much point in determining the frequency density, since you'll be dividing each frequency by the same value.
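The calculation can be sketched in Python as follows, using the bounds and the values from the Frequency column of Table 2. Storing each bin as a (lower bound, upper bound) pair is simply a convenient assumption for this example.

```python
# Bins as (lower bound, upper bound) and frequencies from Table 2.
bins = [(0, 10), (10, 15), (15, 25), (25, 50), (50, 100)]
frequencies = [15, 23, 22, 40, 4]

densities = []
for (lower, upper), frequency in zip(bins, frequencies):
    bin_size = upper - lower  # class width
    densities.append(frequency / bin_size)

for (lower, upper), density in zip(bins, densities):
    print(f"{lower} < {upper}: {density:g}")
```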

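Instead of dividing each frequency by the bin size, you can also set a standard bin width and divide each frequency by the number of times that standard width fits into the bin size. With a standard width of 5, for example, the bin 0 < 10 spans two standard widths, which gives 15 / 2 = 7.5 per standard width.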
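One possible reading of the standard-width approach is sketched below with the bins and frequencies of Table 2. The standard width of 5 is an arbitrary value picked for illustration, and `STANDARD_WIDTH` is a hypothetical name.

```python
# Bins as (lower bound, upper bound) and frequencies from Table 2.
bins = [(0, 10), (10, 15), (15, 25), (25, 50), (50, 100)]
frequencies = [15, 23, 22, 40, 4]

STANDARD_WIDTH = 5  # assumed standard bin width, chosen for illustration

per_standard = []
for (lower, upper), frequency in zip(bins, frequencies):
    # Number of standard widths that fit into this bin.
    times = (upper - lower) / STANDARD_WIDTH
    per_standard.append(frequency / times)

for (lower, upper), value in zip(bins, per_standard):
    print(f"{lower} < {upper}: {value:g} per {STANDARD_WIDTH} units")
```

Each result is simply the frequency density multiplied by the standard width, so the relative heights of the bins do not change.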

As for the relative frequency density, two variations with the same results can be used. The first is by dividing the frequency density by the total (Haighton, Haworth, & Wake, 2003, p. 74), the second would be to divide the relative frequency by the bin size (Kozak, Kozak, Staudhammer, & Watts, 2008, p. 80).

The binning itself is often done with a scale variable, since the frequency table would otherwise often be too long to give a good overview. See binning for more information on how to actually create bins from a scale variable.

Cumulative frequency densities are not often used and have even been argued to be pointless to calculate (Petry & Friesen, 2012).

If you have open-ended bins (e.g. ‘below 20’, ‘65+’) you cannot determine the bin size, and therefore cannot determine the frequency density either.

Obtaining the Frequency Density

with Excel

Excel file from video: IM - Frequency Density (E).xlsm

using stikpetE

without using stikpetE

with Python

Notebook from video: IM - Frequency Density (P).ipynb

using stikpetP

without using stikpetP

with R

Notebook from video: IM - Frequency Density (R).ipynb

using stikpetR

without using stikpetR

with SPSS (somewhat)

Formula

The formula for the Frequency Density is:

\(FD_i = \frac{F_i}{CW_i}\)

With:

\(CW_i = UB_i - LB_i\)

\(F_i\) is the absolute frequency of category \(i\), \(CW_i\) the class width, \(LB_i\) the lower bound, and \(UB_i\) the upper bound.

The relative frequency density can be obtained using:

\(RFD_i = \frac{FD_i}{n} = \frac{RF_i}{CW_i}\)

With:

\(RF_i = \frac{F_i}{n}\)

\(n\) is the sample size, i.e. \(n = \sum_{i = 1}^k F_i\), where \(k\) is the number of categories. \(RF_i\) is the relative frequency of category \(i\).
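To check that both variations of the relative frequency density agree, the sketch below applies the two formulas to the bins and frequencies of Table 2. The variable names are chosen for this example only.

```python
import math

# Bins as (LB_i, UB_i) and frequencies F_i from Table 2.
bins = [(0, 10), (10, 15), (15, 25), (25, 50), (50, 100)]
frequencies = [15, 23, 22, 40, 4]

n = sum(frequencies)                  # sample size
widths = [ub - lb for lb, ub in bins] # class widths CW_i

# Variation 1: frequency density divided by n.
rfd_from_density = [(f / cw) / n for f, cw in zip(frequencies, widths)]

# Variation 2: relative frequency divided by the class width.
rfd_from_relative = [(f / n) / cw for f, cw in zip(frequencies, widths)]

same = all(
    math.isclose(a, b) for a, b in zip(rfd_from_density, rfd_from_relative)
)
print("Both variations agree:", same)
```

Multiplying each relative frequency density by its class width and summing the results gives 1, since that recovers the relative frequencies.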
