How to measure the variability of a single variable?

 

In this blog, we will cover three measures used to quantify the variability of a single variable:

  1. Variance
  2. Standard Deviation
  3. Coefficient of variation

Variance

Variance measures the dispersion of a set of data points around their mean. It is the average of the squared differences between each value and the mean, so it indicates how far the numbers in the data set typically lie from the mean.
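As a rough illustration, variance can be computed by averaging the squared differences from the mean. The sketch below uses made-up numbers and shows both the population form (divide by n) and the sample form (divide by n - 1); which one applies depends on whether the data is the whole population or a sample from it.

```python
import statistics

# Made-up values, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Squared distance of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Population variance: divide by n.
population_variance = sum(squared_deviations) / len(data)

# Sample variance: divide by n - 1.
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
```

The standard-library functions `statistics.pvariance` and `statistics.variance` should agree with the hand-computed values.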

In market research, variance is particularly useful when estimating the probabilities of future events, because it summarizes how widely the values of a random variable are spread around their expected value.

A variance of zero means that all of the values in a data set are identical; any other variance is a positive number. A large variance means that there is more spread in the dataset: the numbers are far from the mean and from each other. A small variance means that the numbers are closer together in value.

One of the primary advantages of variance is that it treats all deviations from the mean of the data set in the same way, regardless of direction. Because each deviation is squared, positive and negative deviations cannot cancel each other out. The raw (unsquared) deviations always sum to zero, which would give the appearance that there was no variability in the data set at all.
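A small sketch with made-up numbers shows why squaring matters: the raw deviations from the mean cancel out, while the squared deviations do not.

```python
# Made-up values, used only for illustration.
data = [1, 2, 3, 4, 10]

mean = sum(data) / len(data)

# Signed distance of each value from the mean.
deviations = [x - mean for x in data]

# Positive and negative deviations cancel each other out.
sum_of_deviations = sum(deviations)

# Squared deviations are all non-negative, so they cannot cancel.
sum_of_squared_deviations = sum(d ** 2 for d in deviations)

print(sum_of_deviations)
print(sum_of_squared_deviations)
```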

Note: The differences are squared because dispersion cannot be negative (dispersion is about distance, and distance cannot be negative). Squaring also amplifies the effect of large differences.

One of the most commonly discussed disadvantages of variance is that it gives added weight to numbers that are far from the mean, or outliers. Squaring these numbers can at times result in skewed interpretations of the data set as a whole.
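To see the effect of an outlier, the following sketch compares the variance of a small made-up dataset with and without one extreme value. The numbers are hypothetical and chosen only to make the difference visible.

```python
import statistics

# Made-up values that sit close together.
values = [10, 12, 11, 13, 12]

# The same values plus a single extreme value.
values_with_outlier = values + [40]

variance_without = statistics.pvariance(values)
variance_with = statistics.pvariance(values_with_outlier)

# One outlier dominates the result because its deviation is squared.
print(variance_without)
print(variance_with)
```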

Standard Deviation

Standard deviation, which is the square root of the variance, is also a measure of dispersion. It is used more often than variance because it is expressed in the same unit as the data and the mean (a measure of central tendency), whereas variance is expressed in squared units.
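A minimal sketch, assuming a hypothetical list of heights in centimetres: the variance comes out in squared centimetres, and taking its square root brings the result back to centimetres.

```python
import math
import statistics

# Hypothetical heights in centimetres.
heights_cm = [160, 165, 170, 175, 180]

# Population variance, expressed in squared centimetres.
variance = statistics.pvariance(heights_cm)

# Standard deviation, expressed in centimetres like the data itself.
std_dev = math.sqrt(variance)

print(variance)
print(std_dev, statistics.pstdev(heights_cm))
```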

Coefficient of variation (CV)

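The coefficient of variation is the standard deviation divided by the mean. It is purely a numerical value (no unit), which makes it better than the standard deviation for comparing the variability of two datasets measured in different units or on different scales. It is also known as the relative standard deviation.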
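The sketch below illustrates this with the same made-up prices expressed in two units. The helper `coefficient_of_variation` is a hypothetical name introduced here, and it assumes the sample standard deviation and a non-zero mean.

```python
import statistics


def coefficient_of_variation(values):
    """Return standard deviation / mean (hypothetical helper)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # Assumption: sample standard deviation; pstdev would also be a valid choice.
    return statistics.stdev(values) / mean


# The same made-up prices, in dollars and in cents.
prices_dollars = [1, 2, 3, 4, 5]
prices_cents = [100 * p for p in prices_dollars]

# The standard deviations differ by a factor of 100...
print(statistics.stdev(prices_dollars), statistics.stdev(prices_cents))

# ...but the coefficients of variation match.
cv_dollars = coefficient_of_variation(prices_dollars)
cv_cents = coefficient_of_variation(prices_cents)
print(cv_dollars, cv_cents)
```

Because the unit cancels out in the division, the two coefficients of variation agree even though the standard deviations do not.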

These three measures describe the variability of a single variable. When it comes to measuring the relationship between two variables, we use covariance and the correlation coefficient. I have covered them in this blog.
