Analyzing a single ordinal variable
1c: Center and dispersion for an ordinal variable
Instead of (or in addition to) creating a table or a visualisation of the data, some statistical measures can provide a description of the sample data. The two most common types of statistical measures are those for central tendency and those for dispersion.
The center
Measures of central tendency try to establish a ‘most typical’ value for the data. For nominal data the mode is the only measure of central tendency that can be used. It is the score (or scores) that occurs most often (to learn more about the mode please see here). We could also determine the mode for an ordinal variable, but then we are not taking the order of the categories into account. Because of this, a more frequently used (and probably better) measure of central tendency is the so-called median.
The median is the score at the middle of all scores, or more formally defined “the middle value in a distribution, below and above which lie values with equal total frequencies or probabilities” (Porkess, 1991, p. 134). This means that at least 50% of the respondents scored equal to or higher than the median, and at least 50% scored equal to or lower than the median. If, for example, the results of a school exam show a median of 70 (out of 100, with 55 or more being a pass), then we know that at least 50% of the students passed. From a frequency table, the median can quickly be found by looking at the cumulative percentages.
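As a small illustration of this definition, the sketch below finds the median of raw ordinal responses by coding each label with its position in the scale. The scale labels and the responses are made up for illustration and are not taken from the survey data. For an even number of responses the two middle scores can differ; this sketch uses `statistics.median_low`, which then picks the lower of the two, and that choice is an assumption rather than a rule from the text.

```python
from statistics import median_low

# Hypothetical ordinal scale, listed from lowest to highest
levels = ["fully disagree", "disagree", "neutral", "agree", "fully agree"]
rank = {label: position for position, label in enumerate(levels)}

# Hypothetical raw responses (made up for illustration)
responses = ["agree", "neutral", "fully agree", "disagree", "agree",
             "neutral", "agree", "fully disagree", "neutral"]

# Replace each label by its position so the order can be used
codes = [rank[r] for r in responses]
median_label = levels[median_low(codes)]

# Share of respondents at or below, and at or above, the median
at_or_below = sum(c <= rank[median_label] for c in codes) / len(codes)
at_or_above = sum(c >= rank[median_label] for c in codes) / len(codes)

print(median_label, at_or_below, at_or_above)
```

Both shares should be at least 0.5, which is the property described above.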
In the example from Table 5 we can see that the cumulative percentage passes the 50% mark when it goes from 31.3% to 67.8%. So, one of the 348 people who chose ‘Not too scientific’ is the one exactly in the middle. The median is therefore ‘Not too scientific’.
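The cumulative approach can be sketched in Python as follows. The function name `median_category`, the category labels, and the counts are all hypothetical; they do not reproduce Table 5. The sketch walks through the categories in order and returns the first one at which the cumulative count reaches half of the sample. If the cumulative count lands exactly on 50%, the middle falls between two categories, and this sketch simply returns the lower one.

```python
from itertools import accumulate

def median_category(categories, counts):
    """Return the category that holds the middle observation.

    categories must be listed in their ordinal order and counts are
    the matching frequencies. When the cumulative count is exactly
    half of the sample, the lower of the two middle categories is
    returned.
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("counts must contain at least one observation")
    for category, cumulative in zip(categories, accumulate(counts)):
        # Same check as 'cumulative percent reaches 50%'
        if 2 * cumulative >= n:
            return category

# Hypothetical ordered categories and frequencies (not the Table 5 data)
categories = ["low", "medium", "high", "very high"]
counts = [30, 45, 15, 10]
print(median_category(categories, counts))
```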
Click here to see how to determine the median.
with Excel
Excel file from video: CE - Median.xlsm.
with Python
Jupyter Notebook used in video: CE - Median.ipynb.
Data file used in video and notebook: GSS2012a.csv.
with R
R script used in video: CE - Median.R.
Data file used in video and script: GSS2012-Adjusted.sav.
with SPSS
Three different methods are shown below; each will give the same result.
using Frequencies
The video below shows how to obtain the median using the Frequencies option.
Datafile used in video: GSS2012-Adjusted.sav
using Explore
The video below shows how to obtain the median using the Explore option.
Datafile used in video: GSS2012-Adjusted.sav
using a shortcut
The video below shows how to obtain the median using a shortcut.
Datafile used in video: GSS2012-Adjusted.sav
The dispersion
The center alone does not give a good picture of the data. If your head is in the oven and your feet are in a refrigerator, you’d be doing fine on average, but the deviation from that average is far too high. That’s why, besides a measure of center, you should also report a measure of dispersion.
One such measure of dispersion is called consensus (Tastle & Wierman, 2007; Tastle, Wierman, & Rex Dumdum, 2005). This measure ranges from 0 to 1. A zero would indicate a complete lack of consensus: the number of people at one end of the ordinal variable (e.g. fully disagree) is then the same as the number of people at the other end (e.g. fully agree). A consensus of one would indicate that all respondents gave the same answer.
Although the interpretation of the measure of consensus is relatively straightforward, the calculation is a bit tricky.
Click here to see how to determine the measure of consensus.
with Excel
Excel file from video: DI - Consensus.xlsm.
with Python
Jupyter Notebook used in video: DI - Consensus.ipynb.
Data file used in video and notebook: GSS2012a.csv.
with R (Studio)
R script used in video: DI - Consensus.R.
Data file used in video and script: GSS2012-Adjusted.sav.
with SPSS (use Excel)
Unfortunately it is, to my knowledge, not possible to determine the measure of consensus with SPSS. I'd recommend creating a frequency table and then following the instructions shown in the Excel method.
manually (formula)
The formula for the measure of consensus is:
\(Cns(X) = 1 + \sum_{i=1}^k p_i \text{log}_2\left(1 - \frac{|X_i-\mu_X|}{d_X}\right)\)
With:
\(\mu_X = \frac{\sum_{i=1}^k X_i\times F_i}{n}\)
\( d_X = \text{max}\left(X_i\right) - \text{min}\left(X_i\right) \)
\( p_i = \frac{F_i}{n}\)
\(n\) is the sample size, \(X_i\) the numeric score of the i-th category, \(F_i\) the frequency of the i-th category, and \(k\) the number of categories.
In this example the measure of consensus will be approximately 0.5116. Besides the measure of consensus, another measure of dispersion that is often mentioned with ordinal data is the so-called interquartile range. However, with ordinal data that only have a few categories (e.g. Likert scales) I would not recommend it.
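The consensus formula given above can be sketched in Python as shown below. This is a minimal sketch and not the code from the notebook: the function name `consensus`, the scores 1 to 5, and the counts are hypothetical, so the printed value will not match the 0.5116 reported for the example. It assumes that `scores` lists every category of the scale (so that \(d_X\) is the full width of the scale), and it skips categories with a frequency of zero, since those contribute nothing to the sum and could otherwise lead to a logarithm of zero.

```python
from math import log2

def consensus(scores, counts):
    """Measure of consensus from a frequency table.

    scores are the numeric values X_i of all categories of the scale,
    counts the matching frequencies F_i.
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("counts must contain at least one observation")
    mu = sum(x * f for x, f in zip(scores, counts)) / n   # mean, mu_X
    d = max(scores) - min(scores)                         # width, d_X
    if d == 0:
        return 1.0  # only one category, so everyone agrees
    total = 0.0
    for x, f in zip(scores, counts):
        if f == 0:
            continue  # p_i = 0 adds nothing and avoids log2(0)
        total += (f / n) * log2(1 - abs(x - mu) / d)
    return 1 + total

# The two extremes described in the text
print(consensus([1, 2, 3, 4, 5], [0, 0, 10, 0, 0]))   # everyone the same answer
print(consensus([1, 2, 3, 4, 5], [10, 0, 0, 0, 10]))  # split over both ends

# Hypothetical frequencies for a five-point scale (made up)
print(consensus([1, 2, 3, 4, 5], [5, 20, 40, 25, 10]))
```

The first two calls reproduce the interpretation of the measure: a value of one when all respondents give the same answer, and a value of zero when the sample is split evenly over the two ends of the scale.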
Now that we have a decent impression of our sample data, we can move on to see what the sample can tell us about the population in the next section.