Measures of central tendency are numbers that indicate the centre of a set of ordered numerical data.
The three common measures of central tendency are the mean, the median and the mode.
The mean gives each element of a data set equal weight. When there are no extreme numbers in the data set (no very low or very high numbers), the mean is a good choice for a measure of central tendency. Statisticians state that "the mean is the most unbiased measure of central tendency".
The median gives the greatest weight to elements in the middle of the ordered data. When there are extreme numbers in the data set (very low or very high numbers), the median is a good choice for a measure of central tendency. The extreme numbers have less effect (or no effect at all) on the median.
The mode is a good choice for a measure of central tendency when the data has many identical data values.
The data below are the hourly sales of songs for an on-line music store over a ten-hour period.
RAW DATA: { 11, 10, 13, 15, 73, 69, 67, 66, 14, 12 }
ORDERED DATA: { 10, 11, 12, 13, 14, 15, 66, 67, 69, 73 }
Mean: 35
Median: 14.5
Mode: There is no mode.
Given the way the data is distributed in this example (with many small and many large numbers), the arithmetic mean is probably the most appropriate measure of central tendency.
The mean number of songs sold per hour at an on-line music store over a ten-hour period is 35.
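The three values for this example can be reproduced with a short script. This is a minimal sketch using Python's standard `statistics` module (Python 3.8 or later is assumed for `multimode`); the helper name `modes_or_none` is hypothetical, and it adopts the convention used above that a data set in which no value repeats has no mode.

```python
from statistics import mean, median, multimode

# Hourly song sales from the example above (raw, unordered).
hourly_sales = [11, 10, 13, 15, 73, 69, 67, 66, 14, 12]


def modes_or_none(values):
    """Return the list of most frequent values, or None if no value repeats."""
    modes = multimode(values)
    # multimode returns every value when all occur once; treat that as "no mode".
    if not modes or values.count(modes[0]) == 1:
        return None
    return modes


print(sorted(hourly_sales))         # the ordered data
print(mean(hourly_sales))           # 35
print(median(hourly_sales))         # 14.5
print(modes_or_none(hourly_sales))  # None
```

Because the data set has an even number of elements, the median is the average of the two middle values (14 and 15).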
The data below are the yearly wages (in dollars) of ten people working at an on-line music store.
DATA: { 41 000, 41 000, 41 000, 41 000, 43 000, 45 000, 48 000, 50 000, 50 000, 250 000 }
Mean: 65 000
Median: 44 000
Mode: 41 000
Given the way the data is distributed in this example (with one person's yearly wage being so large), the median is probably the best measure of central tendency.
NOTE: Nine people are below the mean and one person is above the mean, so the mean is probably not the most appropriate measure of central tendency.
NOTE: The largest group of people working at the store (four in this case) are new employees who earn "starting wages". The mode, therefore, is probably not the most appropriate measure of central tendency.
The median yearly wage of ten people working at an on-line music store is $44 000.00.
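The effect of the single large wage can be checked numerically. The sketch below, using only Python's standard `statistics` module, counts how many people fall below the mean and then recomputes both measures with the highest wage left out; the variable names are illustrative.

```python
from statistics import mean, median

# Yearly wages (in dollars) from the example above.
wages = [41_000, 41_000, 41_000, 41_000, 43_000,
         45_000, 48_000, 50_000, 50_000, 250_000]

wage_mean = mean(wages)
wage_median = median(wages)
below_mean = sum(w < wage_mean for w in wages)

print(wage_mean)     # 65000
print(wage_median)   # 44000.0
print(below_mean)    # 9

# Drop the single highest wage and recompute both measures.
without_top = sorted(wages)[:-1]
print(round(mean(without_top)))  # 44444
print(median(without_top))       # 43000
```

Removing one value moves the mean by more than 20 000 dollars but moves the median by only 1 000 dollars, which is why the median is preferred here.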
The data below are the seventeen shoe sizes of one type of shoe sold in one day at a local shoe store.
DATA: { 5, 6, 7, 7, 7, 7, 7, 7, 8, 9, 9, 10, 11, 12, 13, 13, 15 }
Mean: 9
Median: 8
Mode: 7
Given the way the data is distributed in this example (with so many size seven shoes being sold), the mode is probably the best measure of central tendency.
The modal shoe size of one type of shoe sold at a local shoe store is size seven.
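One way to see why the mode stands out here is to tally how often each size was sold. The following is a small sketch using `collections.Counter` from the Python standard library; the text histogram is only an illustration.

```python
from collections import Counter

# Shoe sizes sold in one day, from the example above.
shoe_sizes = [5, 6, 7, 7, 7, 7, 7, 7, 8, 9, 9, 10, 11, 12, 13, 13, 15]

counts = Counter(shoe_sizes)

# most_common(1) returns a list holding the (value, frequency) pair of the mode.
modal_size, pairs_sold = counts.most_common(1)[0]
print(modal_size, pairs_sold)  # 7 6

# A simple text histogram: one '#' per pair sold.
for size in sorted(counts):
    print(f"{size:>2}: {'#' * counts[size]}")
```

Size seven accounts for six of the seventeen pairs sold, while no other size was sold more than twice.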
What happens to the mean and median if we add a constant to each observation in a data set, or multiply each observation by a constant?
Consider, for example, an instructor who curves an exam by adding five points to each student’s score. What effect does this have on the mean and the median? Adding a constant to each value shifts both the mean and the median by exactly that constant.
For example, if 5 were added to each of the 10 aptitude scores in the above example, the mean of the new data set would be 87.1 (the original mean of 82.1 plus 5) and the new median would be 86 (the original median of 81 plus 5).
Similarly, if each observed data value were multiplied by a constant, the new mean and median would be the original mean and median multiplied by that constant. Returning to the 10 aptitude scores, if all of the original scores were doubled, then the new mean and new median would be double the original mean and median. As we will learn shortly, the effect is not the same on the variance!
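Both rules are easy to verify on any data set. The sketch below uses the hourly song sales from the first example as a stand-in, since the aptitude scores themselves are not listed here; that substitution is an assumption made purely for illustration.

```python
from statistics import mean, median

# Stand-in data: the hourly song sales from the first example (an assumption;
# the aptitude scores referred to in the text are not listed in this section).
values = [11, 10, 13, 15, 73, 69, 67, 66, 14, 12]

shifted = [x + 5 for x in values]   # add a constant to every observation
doubled = [2 * x for x in values]   # multiply every observation by a constant

print(mean(values), median(values))    # 35 14.5
print(mean(shifted), median(shifted))  # 40 19.5
print(mean(doubled), median(doubled))  # 70 29.0
```

The shifted mean and median are each 5 larger than the originals, and the doubled mean and median are each exactly twice the originals.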
Looking Ahead!
Why would you want to know this? One reason, especially for those moving onward to more applied statistics (e.g. regression, ANOVA), is data transformation. For many applied statistical methods, a required assumption is that the data is normal, or very nearly bell-shaped. When the data is not normal, statisticians will often transform the data using one of several techniques, e.g. a logarithmic transformation. We just need to remember that the original data was transformed!
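As a rough illustration of a logarithmic transformation, the sketch below applies a natural log to the yearly wage data from the second example and compares the mean and median before and after. The choice of data set and of the natural log are assumptions for illustration only, and ten values are far too few to claim the transformed data is normal.

```python
import math
from statistics import mean, median

# Yearly wages (in dollars) from the second example.
wages = [41_000, 41_000, 41_000, 41_000, 43_000,
         45_000, 48_000, 50_000, 50_000, 250_000]

# Transform every observation with the natural logarithm.
log_wages = [math.log(w) for w in wages]

# Summaries are computed on the log scale, then converted back to dollars,
# because the original data was transformed.
back_mean = math.exp(mean(log_wages))
back_median = math.exp(median(log_wages))

# Ratio of mean to median: original scale versus back-transformed log scale.
print(mean(wages) / median(wages))
print(back_mean / back_median)
```

The second ratio is closer to 1 than the first, so the extreme wage pulls the mean away from the median less strongly after the transformation.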
Shape
The shape of the data helps us to determine the most appropriate measure of central tendency. The three most important descriptions of shape are Symmetric, Left-skewed, and Right-skewed. Skewness is a measure of the degree of asymmetry of the distribution.
Symmetric
- mean, median, and mode are all the same here
- no skewness is apparent
- the distribution is described as symmetric
[Figure: symmetrical distribution, with mean = median = mode]
Left-skewed or Skewed Left
- mean < median
- long tail on the left
[Figure: distribution skewed to the left, with the mean, median and mode marked]
Right-skewed or Skewed Right
- mean > median
- long tail on the right
[Figure: distribution skewed to the right, with the mean, median and mode marked]
Note! When one has very skewed data, it is better to use the median as the measure of central tendency, since the median is not much affected by extreme values.
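The comparisons listed above (mean < median for left-skewed data, mean > median for right-skewed data) can be turned into a quick check. The function below is a hypothetical helper, not a formal skewness statistic: it applies only that rule of thumb, and a result of "symmetric" means only that the mean and median agree. The left-skewed sample data is invented for illustration.

```python
from statistics import mean, median


def skew_direction(values, tolerance=0.0):
    """Classify shape by comparing mean and median (a rule of thumb only)."""
    m, med = mean(values), median(values)
    if abs(m - med) <= tolerance:
        return "symmetric"  # mean and median agree within the tolerance
    return "right-skewed" if m > med else "left-skewed"


# Yearly wages from the second example: one very large value on the right.
wages = [41_000, 41_000, 41_000, 41_000, 43_000,
         45_000, 48_000, 50_000, 50_000, 250_000]

# Invented data with one very small value on the left.
low_tail = [1, 7, 8, 9, 9, 10, 10, 10]

print(skew_direction(wages))            # right-skewed
print(skew_direction(low_tail))         # left-skewed
print(skew_direction([1, 2, 3, 4, 5]))  # symmetric
```

For the wage data the mean (65 000) exceeds the median (44 000), which matches the long right tail created by the single large wage.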