A matrix is an array of numbers arranged in rows and columns.
A correlation matrix is simply a table showing the correlation coefficients between variables.
Here, the variables are represented in the first row, and in the first column:
The table above has used data from the full health data set.
Observations:
We can use the corr()
function in Python to create a correlation matrix. We also use the round()
function to round the output to two decimals:
Output:
We can use a Heatmap to Visualize the Correlation Between Variables:
The closer the correlation coefficient is to 1, the greener the squares get.
The closer the correlation coefficient is to -1, the browner the squares get.
We can use the Seaborn library to create a correlation heat map (Seaborn is a visualization library based on matplotlib):
import matplotlib.pyplot as plt
import seaborn as sns
correlation_full_health = full_health_data.corr()
axis_corr = sns.heatmap(
correlation_full_health,
vmin=-1, vmax=1, center=0,
cmap=sns.diverging_palette(50, 500, n=500),
square=True
)
plt.show()
Try it Yourself »