A question about the interpretation of data arose today. I was with colleagues, and the discussion focused on measures of central tendency and measures of dispersion, or variability. These are important terms and concepts that form the foundation for any data analysis. It is important to know how they are used.
MEASURES OF CENTRAL TENDENCY
There are five measures of central tendency, that is, numbers that reflect the tendency of data to cluster around the center of the group. Two of these (the geometric mean and the harmonic mean) won’t be discussed here, as they are not typically used in the work Extension does. The three I’m talking about are:
- Mean Symbolized by X̄ (read bar X) NOTE: The bar X refers to the arithmetic mean of a SAMPLE (see last week’s blog entry).
- Median Symbolized by Md (read M subscript d)
- Mode Symbolized by Mo (read M subscript o)
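To make the three measures concrete, here is a minimal sketch using Python's standard `statistics` module. The scores are made up for illustration; they are not from any Extension data set.

```python
import statistics

# Hypothetical sample of eight test scores (made up for illustration).
scores = [70, 85, 85, 90, 75, 85, 60, 90]

mean = statistics.mean(scores)      # arithmetic mean: sum of scores / number of scores
median = statistics.median(scores)  # middle value once the scores are sorted
mode = statistics.mode(scores)      # most frequently occurring score

print("mean:", mean, "median:", median, "mode:", mode)
```

For these scores the mean is 80, while the median and the mode are both 85. The one low score of 60 pulls the mean down but leaves the other two measures alone.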
MEASURES OF VARIABILITY
There are four measures of variability, three of which I want to mention today. The fourth, known as the mean (average) deviation, is seldom used in Extension work. The three are:
- Range Symbolized by R
- Variance Symbolized by V
- Standard deviation Symbolized by s or SD (for sample) and σ, the lower case Greek letter sigma (for standard deviation of a population).
The range is the distance between the highest and lowest scores in a distribution. In this example, the blue distribution (Distribution A) has a larger range than the red distribution (Distribution B).
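As a rough sketch, the range can be computed as the highest value minus the lowest value. The two sets of scores below are hypothetical stand-ins I made up; they are not the data behind the blue and red distributions. The helper name `score_range` is also just for illustration.

```python
# Two hypothetical sets of scores (made up; not the data behind the figure).
distribution_a = [40, 55, 70, 85, 100]
distribution_b = [60, 65, 70, 75, 80]


def score_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)


range_a = score_range(distribution_a)  # 100 - 40
range_b = score_range(distribution_b)  # 80 - 60

print("range A:", range_a, "range B:", range_b)
```

Both sets have the same mean (70), yet the first has a range of 60 and the second a range of 20. That is why a measure of central tendency alone does not describe a distribution.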
Variance is more technical. For a sample, it is the sum of the squared deviations about the mean (each score's difference from the mean, squared) divided by the number of scores minus 1. Dividing by n - 1 rather than n corrects for the tendency of a sample to underestimate the variability in the population, so it gives a slightly larger, more conservative estimate.
There is a mathematical formula for computing the variance. Fortunately, a computer software program like SPSS or SAS will do it for you.
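If you want to see what the software is doing, here is a minimal sketch of the usual sample variance calculation: the sum of squared deviations about the mean, divided by n - 1. The scores are hypothetical, and Python's `statistics` module stands in for SPSS or SAS.

```python
import statistics

scores = [70, 85, 85, 90, 75, 85, 60, 90]  # hypothetical sample
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations about the mean.
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Sample variance: divide by n - 1 rather than n.
variance_by_hand = sum_of_squares / (n - 1)

# The standard library's sample variance uses the same n - 1 denominator.
variance_builtin = statistics.variance(scores)

# Population variance divides by n instead, so it comes out smaller.
variance_population = statistics.pvariance(scores)

print(variance_by_hand, variance_builtin, variance_population)
```

Here the sum of squares is 800, so the sample variance is 800 / 7 (about 114.29) and the population variance is 800 / 8 = 100. The difference shrinks as the sample gets larger.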
The standard deviation is the square root of the variance. It gives us an indication of “…how much each score in a set of scores, on average, varies from the mean” (Salkind, 2004, p. 41). Again, there is a mathematical formula that is computed by a software package. Most people are familiar with the mean and standard deviation of IQ scores: on the most widely used tests, mean = 100 and SD = 15.
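A short sketch of that relationship, again with made-up scores and Python's `statistics` module as the software package:

```python
import math
import statistics

scores = [70, 85, 85, 90, 75, 85, 60, 90]  # hypothetical sample

variance = statistics.variance(scores)   # sample variance (n - 1 denominator)
sd_from_variance = math.sqrt(variance)   # square root of the variance
sd_builtin = statistics.stdev(scores)    # sample standard deviation, s

print(round(sd_from_variance, 2), round(sd_builtin, 2))
```

Both routes give about 10.69 for these scores. Unlike the variance, the standard deviation is in the same units as the original scores, which makes it easier to interpret.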
Convention has it that lower case Greek letters are used for parameters of populations and Roman letters for the corresponding estimates from samples. So you would see σ (lower case sigma) for the standard deviation and μ (lower case mu) for the mean of a population, and s (or SD) for the standard deviation and X̄ for the mean of a sample.
These statistics relate to the measurement scale you have chosen to use. Permissible statistics for a nominal scale are frequency and mode; for an ordinal scale, median and percentiles; for an interval scale, mean, variance, standard deviation, and Pearson correlation; and for a ratio scale, the geometric mean. Each scale also permits the statistics of the scales below it. So think seriously about reporting a mean for your Likert-type scale, whose response options are ordered categories rather than equally spaced numbers. What exactly does that tell you?
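Here is a small sketch of why that question matters. The responses are hypothetical answers to a single Likert-type item (1 = strongly disagree through 5 = strongly agree) that I made up to show a polarized group.

```python
import statistics
from collections import Counter

# Hypothetical responses to one Likert-type item
# (1 = strongly disagree ... 5 = strongly agree); made up for illustration.
responses = [1, 1, 1, 1, 2, 5, 5, 5, 5, 5, 5]

frequencies = Counter(responses)       # counts per response option
mode = statistics.mode(responses)      # most common response
median = statistics.median(responses)  # middle response once sorted
mean = statistics.mean(responses)      # treats the options as equally spaced numbers

print("frequencies:", dict(frequencies))
print("mode:", mode, "median:", median, "mean:", round(mean, 2))
```

The mean comes out near 3.27, which reads as "neutral," yet not one person chose 3. The frequencies, mode, and median (both 5) describe what the respondents actually said.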