Analysing two scale variables
Part 1: Impression of sample data
To begin analysing two scale variables, we can compare some statistical measures. I would usually recommend starting with a frequency table, and it can still be useful here, but if the range of the scale variable is very large the table might not be very insightful.
The statistical measures of interest for a scale variable are usually the average (strictly speaking, the arithmetic mean) as a measure of the center, and the standard deviation as a measure of the variation. For example, a company would like to know if there is a relation between current salaries and beginning salaries. Table 1 shows the descriptive measures.
The mean and standard deviation can be determined with SPSS, with R, or with Excel.
with SPSS
There are four different ways to determine the mean and standard deviation with SPSS:
using Frequencies
using Descriptives
using Explore
using a shortcut
with R
with Excel
From the table, we can see that the average (mean) current salary is about double the beginning salary. The ‘Std. Deviation’ is short for standard deviation and tells us something about the variation. It shows roughly how far the scores typically lie above or below the mean.
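As a rough illustration of what these two measures are, the sketch below computes a mean and a standard deviation in Python with the standard library. The salary values are hypothetical and invented for this example only; they are not the data behind Table 1. It also assumes the sample standard deviation (dividing by n - 1), which is what SPSS reports as ‘Std. Deviation’.

```python
import statistics

# Hypothetical salaries, invented purely for illustration
# (NOT the data behind Table 1).
salaries = [30000, 32000, 35000, 40000, 43000]

# Arithmetic mean: the sum of the scores divided by the number of scores.
mean = statistics.mean(salaries)

# Sample standard deviation: assumes the n - 1 denominator.
sd = statistics.stdev(salaries)

print(f"mean = {mean:.2f}, standard deviation = {sd:.2f}")
```

If the population formula (dividing by n) is wanted instead, `statistics.pstdev` could be used in place of `statistics.stdev`.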
A high standard deviation indicates a large spread in the scores, which could indicate that people disagreed with each other, or that something is very unstable. In the example, the variation for the current salary is much higher than the variation for the beginning salary. However, the range for the current salary is also a lot higher.
To compare two standard deviations, the coefficient of variation is often calculated. This is simply the standard deviation divided by the mean. For the current salary, this is 17075.661 / 34419.57 = 0.50, and for the beginning salary it is 7870.638 / 17016.09 = 0.46. In relative terms the variation therefore seems similar, but in absolute terms it is higher for the current salary.
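The same calculation can be sketched in a few lines of Python, using the means and standard deviations quoted in the text. The function name `coefficient_of_variation` is only an illustrative choice, not something taken from SPSS, R, or Excel.

```python
def coefficient_of_variation(sd, mean):
    """Return the standard deviation divided by the mean."""
    return sd / mean

# Values as quoted in the text for the two salary variables.
cv_current = coefficient_of_variation(17075.661, 34419.57)
cv_beginning = coefficient_of_variation(7870.638, 17016.09)

print(f"current salary:   {cv_current:.2f}")
print(f"beginning salary: {cv_beginning:.2f}")
```

Because the coefficient of variation divides by the mean, it is unit-free, which is what makes the two values comparable even though the salary levels differ.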
This gives a first impression of the two variables, but we are probably more interested in whether there is a relation between them. To find out, we can start by visualising the sample data on the next page.