Measures of relationship between variables | Correlation coefficient and covariance

 



While performing EDA (Exploratory Data Analysis), one of the most crucial steps is to find the relationship between two or more variables, so that we understand how one behaves when the other changes. This helps us figure out how much each independent variable matters to the target and thus create a model with a reduced number of parameters (only the most important or significant ones).

Covariance

Covariance provides insight into how two variables are related to one another. More precisely, covariance measures how two random variables in a data set change together. A positive covariance means that the two variables are positively related: they tend to move in the same direction. A negative covariance means that the variables are inversely related: they tend to move in opposite directions.
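To make the sign of the covariance concrete, here is a minimal sketch in plain Python. It assumes the sample covariance (dividing by n - 1), which the post does not specify. The function name and the three small data sets are made up for illustration.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Average product of the deviations from each mean.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Illustrative (invented) data: five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]   # rises with hours studied
hours_of_tv = [5, 4, 3, 2, 1]       # falls as hours studied rise

cov_pos = covariance(hours_studied, exam_score)
cov_neg = covariance(hours_studied, hours_of_tv)

print(cov_pos)  # 14.0  (positive: the variables move together)
print(cov_neg)  # -2.5  (negative: the variables move in opposite directions)
```

Note that the size of these numbers depends on the units of the data, so only the sign is easy to interpret here.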

Correlation Coefficient

The correlation coefficient, written ρ(x, y), takes values between -1 and +1 and measures the strength and direction of the linear relationship between two variables. When the correlation coefficient is zero, there is no identifiable linear relationship between the variables: if one variable moves, that tells us nothing, in a linear sense, about the movement of the other. If the correlation coefficient is -1, the variables are perfectly negatively (inversely) correlated: when one variable increases, the other decreases in a fixed proportion, so the two always move in opposite directions. If the correlation coefficient lies between -1 and 0, there is an imperfect negative correlation, and the closer it gets to -1, the stronger that negative correlation is. Positive values mirror this: +1 is a perfect positive correlation, and values between 0 and +1 indicate an imperfect positive one.
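The cases above can be checked with a short sketch. It assumes the Pearson correlation coefficient (covariance divided by the product of the standard deviations), which is the usual meaning of ρ but is not named explicitly in this post. The function name and data are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    # The (n - 1) factors of covariance and standard deviations cancel out.
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]

# y falls by exactly 2 for every unit increase in x: perfect negative correlation.
perfect_neg = pearson(x, [10, 8, 6, 4, 2])

# y mostly falls as x rises, but not on a straight line: imperfect negative correlation.
imperfect_neg = pearson(x, [10, 9, 5, 6, 2])

# y = x squared on a symmetric range: clearly related, yet no linear relationship.
u_shape = pearson([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(perfect_neg)    # -1.0
print(imperfect_neg)  # roughly -0.94, between -1 and 0
print(u_shape)        # 0.0
```

The last case is worth noticing: a coefficient of zero rules out a linear relationship, not every relationship.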



Note: ρ(x, y) = ρ(y, x). Correlation does not imply causation.
What does the above statement, "Correlation does not imply causation", mean? We will cover it in a separate blog.
