Introduction To Measures of Central Tendency
There are three main measures of central tendency: the mode, the median, and the mean.
- Mean: Average Value
- Median: Middle Value
- Mode: Most Frequent
The mean is the sum of all the values divided by the number of observations (the sample size). It is what is commonly called the average.
Example: The mean of the values 5,6,6,8,9,9,9,9,10,10 is (5+6+6+8+9+9+9+9+10+10)/10 = 8.1
Limitation: It is affected by extreme values. Very large or very small numbers can distort the answer.
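The calculation and the limitation above can be checked with a minimal sketch using Python's standard `statistics` module. The extra value of 100 is a hypothetical extreme value added only for illustration; it is not part of the original example.

```python
from statistics import mean

# Values from the example above
values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
avg = mean(values)  # 81 / 10 = 8.1

# Hypothetical extreme value (an assumption, not from the example)
with_extreme = values + [100]
avg_extreme = mean(with_extreme)  # 181 / 11, roughly double the original mean

print(avg, round(avg_extreme, 2))
```

A single extreme value pulls the mean well away from where most of the data sits.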
The median is the middle value of your observations when they are ordered from the smallest to the largest. It splits the data in half: half of the values lie at or above the median, and half lie at or below it.
Example (odd number of values): 7, 8, 7, 6, 9, 8, 8 → 6, 7, 7, 8, 8, 8, 9 → 8 is the median.
Example (even number of values): 7, 8, 7, 6, 9, 8, 8, 7 → 6, 7, 7, 7, 8, 8, 8, 9 → (7 + 8)/2 = 7.5 is the median. With an even count, the median is the average of the two middle values.
Advantage: It is largely unaffected by extreme values. A few very large or very small numbers do not distort it.
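As a quick check of the two median examples, here is a small sketch with the standard `statistics.median` function, which sorts the data internally and averages the two middle values when the count is even. The variable names are illustrative.

```python
from statistics import median

odd_count = [7, 8, 7, 6, 9, 8, 8]       # 7 values: one middle value
even_count = [7, 8, 7, 6, 9, 8, 8, 7]   # 8 values: two middle values

median_odd = median(odd_count)    # 8
median_even = median(even_count)  # (7 + 8) / 2 = 7.5

print(median_odd, median_even)
```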
The mode is the value that occurs most frequently. In other words, it is the most common outcome; for categorical data, it is the name of the category that occurs most often. There can be more than one mode.
Example: 5, 6, 5, 7, 5, 8, 9, 5 → 5 is the mode.
Example: 5, 6, 6, 5, 7, 6, 5, 6, 8, 9, 5 → 5 and 6 are both modes (each occurs four times).
Advantage: It can be used when the data is not numerical.
Limitations:
1. There may be no mode at all if no value is repeated.
2. There may be more than one mode.
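Both cases above (no mode, several modes) can be handled with a small helper. This is a sketch, not a standard function: the name `modes` and the convention of returning an empty list when no value repeats are assumptions made for illustration, and the tied, unique, and colour data sets are hypothetical.

```python
from collections import Counter

def modes(data):
    """Return every most-frequent value, or [] when no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:          # every value occurs once: treat as "no mode"
        return []
    return [value for value, count in counts.items() if count == top]

single = modes([5, 6, 5, 7, 5, 8, 9, 5])   # [5]
tied = modes([1, 1, 2, 2, 3])              # [1, 2]  (hypothetical data)
none = modes([3, 4, 5])                    # []      (hypothetical data)
colors = modes(["red", "blue", "red"])     # ['red'] (non-numerical data)

print(single, tied, none, colors)
```

Because the helper only counts occurrences, it works on non-numerical values as well.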
WHEN TO USE WHAT MEASUREMENT OF CENTRAL TENDENCY?
Mean – When your data is not skewed, i.e. it is roughly symmetric (for example, normally distributed) and there are no extreme values (outliers) in the data set.
Median – When your data is skewed or you are dealing with ordinal (ordered categories) data.
Mode – When dealing with nominal (unordered categories) data.
When to use Median instead of Mean
“If your data is quantitative, then go for the mean or the median.”
Basically, if your data has influential outliers or is highly skewed, the median is the better measure of central tendency. Otherwise, go for the mean.
E.g., salaries: 15k, 18k, 16k, 14k, 15k, 15k, 12k, 17k, 90k, 95k. The mean is 30.7k, whereas most workers have salaries in the $12k to $18k range. The median is 15.5k, which sits inside that range, so the median is to be preferred.
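The salary comparison can be reproduced with a short sketch using the standard `statistics` module. The values are the ten salaries listed above, in thousands; the variable names are illustrative.

```python
from statistics import mean, median

# Salaries from the example, in thousands
salaries_k = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

mean_salary = mean(salaries_k)      # 307 / 10 = 30.7
median_salary = median(salaries_k)  # (15 + 16) / 2 = 15.5

# How many workers actually earn at least the mean?
at_or_above_mean = sum(s >= mean_salary for s in salaries_k)  # 2

print(mean_salary, median_salary, at_or_above_mean)
```

Only the two highest earners reach the mean, while the median falls where most of the salaries are.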
When to use Mode
If the data is nominal, it is not meaningful to calculate the mean or median, so go for the mode. (For ordinal data the median can still be used, because the categories can be ordered.) Normally, the mode is used for categorical data when we wish to know which category is the most common.
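To make this concrete, here is a minimal sketch that builds a frequency table for categorical data with `collections.Counter` and reads off the most common category. The survey responses are hypothetical and used only for illustration.

```python
from collections import Counter

# Hypothetical nominal data: how respondents travel to work
responses = ["bus", "car", "bus", "bike", "car", "bus"]

# Frequency table, most common category first
frequency = Counter(responses).most_common()
# [('bus', 3), ('car', 2), ('bike', 1)]

mode_category, mode_count = frequency[0]  # ('bus', 3)

print(frequency)
print(mode_category, mode_count)
```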
Check the effect of outliers in Measures of Central Tendency
“Outliers” are values that “lie outside” the other values.
Long Jump: A new coach has been working with the long jump team this month, and the athletes’ performance has changed. The six changes are +0.15 m, +0.11 m, +0.06 m, +0.06 m, +0.12 m (Bob), and -0.56 m (Sam).
So here, Sam is an outlier.
The mean, including Sam (the outlier):
Mean = (0.15 + 0.11 + 0.06 + 0.06 + 0.12 - 0.56) / 6 = -0.06 / 6 = -0.01 m
So, on average, the performance went DOWN.
The mean, excluding Sam:
Mean = (0.15 + 0.11 + 0.06 + 0.06 + 0.12) / 5 = 0.50 / 5 = 0.1 m
So, on average, the performance went UP.
The median (“middle” value):
including Sam is: 0.085
without Sam is: 0.11 (went up a little)
The mode (the most common value):
including Sam is: 0.06
without Sam is: 0.06 (stayed the same)
“The mode and median didn’t change very much, but the mean flipped from a decrease to an increase. This is the limitation of the mean: its value is affected by outliers.”
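The whole comparison can be reproduced with a short sketch using the standard `statistics` module. The six values are the ones used in the mean calculation above; the helper name `summarize` is illustrative, and results are rounded to three decimals to hide floating-point noise.

```python
from statistics import mean, median, mode

# Changes in jump distance (metres); the last value is Sam, the outlier.
changes = [0.15, 0.11, 0.06, 0.06, 0.12, -0.56]
without_sam = changes[:-1]

def summarize(data):
    """Return (mean, median, mode), each rounded to 3 decimals."""
    return (round(mean(data), 3), round(median(data), 3), round(mode(data), 3))

with_outlier = summarize(changes)         # (-0.01, 0.085, 0.06)
without_outlier = summarize(without_sam)  # (0.1, 0.11, 0.06)

print("with Sam:   ", with_outlier)
print("without Sam:", without_outlier)
```

The mean changes sign when the outlier is removed, the median shifts slightly, and the mode does not move at all.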
Type of variable and the best measure of central tendency for it:
- Nominal: Mode
- Ordinal: Median
- Interval/Ratio (not skewed): Mean
- Interval/Ratio (skewed): Median
This blog has shown how to calculate the measures of central tendency and how to choose between them, so that you can use these values to describe and summarize your data.