Next, we want to test whether the coefficients of the linear regression function have a significant impact on the dependent variable (Calorie_Burnage).
This means that we want to use statistical tests to check whether there is a relationship between Average_Pulse and Calorie_Burnage.
There are four components that explain the statistics of the coefficients: the standard error, the t-value, the P-value, and the confidence interval.
We will focus on understanding the "P-value" in this module.
The P-value is a number that helps us decide whether there is a relationship between Average_Pulse and Calorie_Burnage.
We test whether the true value of the coefficient is equal to zero (no relationship). The statistical procedure for this is called hypothesis testing.
Hypothesis testing is a statistical procedure for checking whether the data provide enough evidence to reject an assumption about the true values.
In our example, we are testing whether the true coefficient of Average_Pulse and the true intercept are each equal to zero.
A hypothesis test has two statements: the null hypothesis and the alternative hypothesis.
Mathematically written:
H0: Average_Pulse = 0
HA: Average_Pulse ≠ 0
H0: Intercept = 0
HA: Intercept ≠ 0
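To make the test of H0: Average_Pulse = 0 concrete, here is a minimal sketch that uses only the Python standard library. It assumes a simple regression with one explanatory variable, divides the estimated slope by its standard error to get a t-value, and turns that into a two-sided P-value by numerically integrating the t-distribution. The function names and the small data set are hypothetical and are not the tutorial's health data. In practice a statistics library would normally report these numbers for you.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_norm) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule on [0, |t_stat|] (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3              # probability mass between 0 and |t_stat|
    return max(0.0, 1 - 2 * area)     # mass in both tails

def slope_test(x, y):
    """Return slope, standard error, t-value and P-value for H0: slope = 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / std_err
    return slope, std_err, t_stat, two_sided_p_value(t_stat, n - 2)

# Hypothetical values, for illustration only
pulse = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
calories = [300, 240, 330, 260, 310, 250, 320, 270, 290, 280]

slope, std_err, t_stat, p_value = slope_test(pulse, calories)
print(f"slope={slope:.3f}  std err={std_err:.3f}  t={t_stat:.3f}  P-value={p_value:.3f}")
```

A t-value close to zero gives a large P-value, and a t-value far from zero gives a small one.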
The sign ≠ means "not equal to".
The null hypothesis can either be rejected or not.
If we reject the null hypothesis, we conclude that there is a relationship between Average_Pulse and Calorie_Burnage. The P-value is used to reach this conclusion.
A common threshold for the P-value is 0.05.
Note: A threshold of 0.05 means that, when the null hypothesis is actually true, we will falsely reject it 5% of the time. In other words, we accept a 5% risk of concluding that there is a relationship when there is none.
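The 5% false rejection rate can be checked with a small simulation. The sketch below is an illustration on randomly generated data, not the tutorial's data set. It repeatedly creates two unrelated variables (so the null hypothesis is true by construction), fits a regression line, and counts how often the slope looks significant at the 0.05 threshold. It assumes 30 observations per sample, so the test has 28 degrees of freedom, and it uses the tabulated critical t-value 2.048 in place of computing each P-value.

```python
import math
import random

N = 30          # observations per simulated sample (assumed for this sketch)
T_CRIT = 2.048  # two-sided 5% critical value of Student's t with N - 2 = 28 degrees of freedom

def slope_t_value(x, y):
    """t-value of the slope in a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope / std_err

def false_rejection_rate(trials=2000, seed=0):
    """Share of samples in which H0 is rejected even though it is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(N)]
        y = [rng.gauss(0, 1) for _ in range(N)]  # generated independently of x
        if abs(slope_t_value(x, y)) > T_CRIT:    # same decision as P-value < 0.05
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"Share of false rejections: {rate:.3f}")
```

The printed share should land close to 0.05. It varies somewhat with the seed and the number of trials.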
If the P-value is lower than 0.05, we can reject the null hypothesis and conclude that there is a relationship between the variables.
However, the P-value of Average_Pulse is 0.824. So, we cannot conclude a relationship between Average_Pulse and Calorie_Burnage.
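The decision rule is a simple comparison, and can be sketched as below. The function name `decide` and its `alpha` parameter are hypothetical, chosen only for this illustration. The value 0.824 is the P-value of Average_Pulse reported above.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the chosen threshold (alpha)."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.824))  # fail to reject H0
```

Note that the outcome is "fail to reject", not "accept": a large P-value means there is not enough evidence of a relationship. It does not prove that there is none.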
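It means that, if the true coefficient of Average_Pulse were zero, there would be an 82.4% chance of getting an estimate at least as far from zero as the one we observed. The data are therefore consistent with no relationship.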
The intercept adjusts the regression function so that it predicts more precisely. It is therefore uncommon to interpret the P-value of the intercept.