2.1. Nominal variable (absolute and relative frequencies)


The first step in getting an overview of a set of scores is usually to count how often each value occurs. Let’s say we have the variable EdProg (short for educational program), with the following values and codes: 1 = Business, 2 = IT, 3 = Mathematics, 4 = Psychology. We collect the scores on this variable for 10 students and obtain the results shown in Table 7.

Table 7
Ten example scores
Student A B C D E F G H I J
EdProg 1 2 2 3 1 2 1 4 1 3

Even with 10 students this listing is not very clear; imagine if we had listed 100 students instead. Since there are only four possible values, simply counting how many scores there are for each value gives a better overview, as shown in Table 8.

Table 8
Example of a Frequency table
Educational Program   Number of scores
Business              4
IT                    3
Mathematics           2
Psychology            1

We can quickly see that Business is the most common program (four students) and that only one student does Psychology. In statistics these ‘numbers of scores’ are called absolute frequencies, often abbreviated to simply frequencies.

definition 20: (absolute) Frequency
“the number of occurrences of a particular phenomenon” (Zedeck, “Frequency”, 2014, p. 144).
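The counting that leads to Table 8 can be sketched in a few lines of Python. This is only an illustration, not part of the original course material: the names `LABELS`, `scores` and `frequencies` are my own choices, and the data are the ten scores from Table 7.

```python
from collections import Counter

# Codes for the nominal variable EdProg, as given in the text.
LABELS = {1: "Business", 2: "IT", 3: "Mathematics", 4: "Psychology"}

# Scores of students A through J, copied from Table 7.
scores = [1, 2, 2, 3, 1, 2, 1, 4, 1, 3]

# Count how often each code occurs: these are the absolute frequencies.
counts = Counter(scores)

# Replace the codes by their labels, keeping the order of the codes.
frequencies = {LABELS[code]: counts[code] for code in sorted(LABELS)}

for program, frequency in frequencies.items():
    print(f"{program:<12} {frequency}")
```

A `Counter` returns 0 for a code that never occurs, so a program without any students would still show up in the table with a frequency of 0.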

To make comparisons easier, the absolute frequency is often divided by the total frequency (the total number of scores). Table 9 shows the calculations for the example.

Table 9
Calculation example of relative frequency
Educational program   (absolute) Frequency   Fraction
Business              4                      4 / 10 = 0.4
IT                    3                      3 / 10 = 0.3
Mathematics           2                      2 / 10 = 0.2
Psychology            1                      1 / 10 = 0.1
TOTAL                 10                     10 / 10 = 1.0

These fractions are called relative frequencies.

definition 21: Relative Frequency
"[absolute frequency] expressed as a fraction of the total frequency" (Kenney & Keeping, 1954, p. 17).

Note that since the relative frequency is the absolute frequency divided by the total, you can reverse the process and determine that the absolute frequency is the relative frequency multiplied by the total.
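As a small sketch of both directions (the variable names are my own and the frequencies are those of the example), the division and its reversal could look like this in Python:

```python
# Absolute frequencies from the example.
frequencies = {"Business": 4, "IT": 3, "Mathematics": 2, "Psychology": 1}

# The total frequency is the sum of all absolute frequencies.
total = sum(frequencies.values())

# Relative frequency = absolute frequency / total.
relative = {program: count / total for program, count in frequencies.items()}

# Reverse: absolute frequency = relative frequency x total.
# round() is only there to remove tiny floating-point inaccuracies.
recovered = {program: round(rel * total) for program, rel in relative.items()}

for program in frequencies:
    print(f"{program:<12} {relative[program]:.1f} {recovered[program]}")
```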

The relative frequency is often expressed as a percentage. The word combines ‘per’, meaning ‘by’ or ‘for each’, and ‘cent’, meaning 100. A percentage is thus in essence ‘per 100’. It shows how many cases with a certain value you could expect if there had been 100 cases (instead of, for example, the 10 in the example).

The percentage can be calculated by multiplying the relative frequency by 100, as shown in Table 10.

Table 10
Example calculation of percentages
Educational Program   (absolute) Frequency   Relative Frequency   Percent
Business              4                      0.4                  0.4 x 100 = 40%
IT                    3                      0.3                  0.3 x 100 = 30%
Mathematics           2                      0.2                  0.2 x 100 = 20%
Psychology            1                      0.1                  0.1 x 100 = 10%
TOTAL                 10                     1.0                  1.0 x 100 = 100%
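The same multiplication can be sketched in Python. The names below are my own, and the relative frequencies are simply those of the example.

```python
# Relative frequencies from the example.
relative = {"Business": 0.4, "IT": 0.3, "Mathematics": 0.2, "Psychology": 0.1}

# Percentage = relative frequency x 100.
percentages = {program: rel * 100 for program, rel in relative.items()}

for program, percent in percentages.items():
    # ".0f" prints the percentage without decimals.
    print(f"{program:<12} {percent:.0f}%")
```

Python can also do the multiplication and add the sign in one step with the `%` format type, for example `f"{0.4:.0%}"`.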

Weisstein (2002) defines a percentage as shown below.

definition 22: Percentage
“a way of expressing ratios in terms of whole numbers. A ratio or fraction is converted to a percentage by multiplying by 100 and appending a "percentage sign" %” (Weisstein, 2002, p. 2200).

If you are given percentages, you can easily convert them to relative frequencies by dividing by 100 (e.g. 40% = 40/100 = 0.40). If needed, you can then convert these back to absolute frequencies, provided the total number of cases is given.

An alternative method is to note that in the example 10 students correspond to 100%, so dividing both by 100 shows that 0.1 students correspond to 1%. To find the absolute frequency corresponding to 40%, we multiply both sides by 40: 40 x 0.1 = 4 students correspond to 40 x 1% = 40%.
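Both routes from a percentage back to an absolute frequency can be written down as small Python functions. The function names are hypothetical and only meant as an illustration of the two calculations described above.

```python
def percent_to_frequency(percent, total):
    """Route 1: percentage -> relative frequency -> absolute frequency."""
    relative = percent / 100
    return round(relative * total)


def percent_to_frequency_via_one_percent(percent, total):
    """Route 2: first work out how many cases correspond to 1%."""
    one_percent = total / 100
    return round(percent * one_percent)


# 40% of the 10 students in the example.
print(percent_to_frequency(40, 10))
print(percent_to_frequency_via_one_percent(40, 10))
```

The `round()` call assumes the result should be a whole number of cases. If the percentage itself was rounded before it was reported, the outcome is only an approximation of the original frequency.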

In the example the relative frequencies and the percentages add up nicely to 1 and 100% respectively, but because of rounding this might not always be the case. Even if the rounded values do not add up exactly, the total should always be reported as 1.0 for relative frequencies or 100% for percentages.
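A quick illustration of this rounding issue, using made-up data that are not part of the course example (three hypothetical categories X, Y and Z with one case each):

```python
# Hypothetical frequencies, chosen so that rounding causes trouble.
counts = {"X": 1, "Y": 1, "Z": 1}
total = sum(counts.values())

# Each category is 33.33...%, which rounds to 33% without decimals.
rounded = {name: round(100 * count / total) for name, count in counts.items()}

# The rounded percentages add up to 99, not 100.
sum_of_rounded = sum(rounded.values())
print(rounded, sum_of_rounded)

# The total is still reported as 100%, since all cases together are everything.
reported_total = 100
```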

Another issue is the number of decimals. In the example this was somewhat avoided since all values had only one decimal, but in real life this will most often not be the case. There are different opinions on this, and the choice sometimes also depends on the accuracy of the measurement itself, the style guide of whoever you write for, etc. In case you are interested, a nice article that compares a few different options was written by Cole (2015). I’d suggest using two decimal places for relative frequencies and no decimals for percentages, but please note that in some cases more decimals might be required or desired.

With relative frequencies a problem arises if not everyone has answered the question. Perhaps in the example survey there were five students who did not answer this question. Should the relative frequency be based on the total including or excluding these five students? Since each of these could be of interest, we need another term: valid. If we use as a total the number of cases excluding the missing values, we call it the valid relative frequency (or valid percent, if it’s in percentages). The calculations of the relative and valid relative frequencies are shown in Table 11.

Table 11
Example calculation of valid relative frequency
Educational Program   (absolute) Frequency   Relative Frequency   Valid Relative Frequency
Business              4                      4 / 15 ≈ 0.27        4 / 10 = 0.4
IT                    3                      3 / 15 = 0.20        3 / 10 = 0.3
Mathematics           2                      2 / 15 ≈ 0.13        2 / 10 = 0.2
Psychology            1                      1 / 15 ≈ 0.07        1 / 10 = 0.1
SUBTOTAL              10                     10 / 15 ≈ 0.67       10 / 10 = 1.0
No answer given       5                      5 / 15 ≈ 0.33
TOTAL                 15                     15 / 15 = 1.0
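The two columns of Table 11 differ only in the total that is used. A minimal Python sketch, with variable names of my own choosing and the five missing answers from the example:

```python
# Absolute frequencies of the students who answered the question.
frequencies = {"Business": 4, "IT": 3, "Mathematics": 2, "Psychology": 1}

# Number of students who gave no answer.
missing = 5

# Total excluding and including the missing values.
valid_total = sum(frequencies.values())
total = valid_total + missing

# Relative frequency: divide by all cases, including the missing ones.
relative = {program: count / total for program, count in frequencies.items()}

# Valid relative frequency: divide by the cases with an answer only.
valid_relative = {
    program: count / valid_total for program, count in frequencies.items()
}

for program in frequencies:
    print(f"{program:<12} {relative[program]:.2f} {valid_relative[program]:.2f}")
```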

Most often the reported relative frequency is actually the valid relative frequency, and the term ‘valid’ is not mentioned. In this course we will also use this convention: when we mention relative frequency we actually mean valid relative frequency, unless specifically stated otherwise. The same goes for percent and valid percent.

definition 23: Valid (relative frequency/percent)
using as a total the number of cases excluding any missing values for that variable

The (absolute) frequency and the (valid) relative frequency (possibly expressed as a percentage) are the two types of frequencies you can use in a table to get an overview of the scores of a nominal variable. These frequencies can also be used for any other measurement level. In the next segments we will see which other types of frequencies are possible for those levels, but not for a nominal variable.
