In this chapter we will learn how to compute some basic statistical measures in R: mean, variance, standard deviation, median, range, covariance, and correlation.
The formulas for these measures are:
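\[ \text{Mean} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
\[ \text{Variance} = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^2 \]
\[ \text{Median} = l + \dfrac{\frac{N}{2} - f_c}{f} \times h \]
\[ \text{Range} = \text{Maximum value} - \text{Minimum value} \]
\[ \text{Covariance} = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \bar{x}\bar{y} \]
\[ \text{Correlation } (r) = \dfrac{\text{Cov}(x,y)}{\sqrt{\text{Variance}(x)}\,\sqrt{\text{Variance}(y)}} \]
The median formula applies to grouped data, where \(l\) is the lower limit of the median class, \(N\) the total frequency, \(f_c\) the cumulative frequency of the class preceding the median class, \(f\) the frequency of the median class, and \(h\) the class width. For ungrouped data, such as the data used in this chapter, the median is simply the middle value of the sorted observations.
Let us consider the following data: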
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height <- c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)
Once the data have been entered in R, we can perform various statistical calculations.
By "mean" we generally refer to the arithmetic mean, which gives the center of our data.
mean_weight <- mean(weight)
mean_weight
[1] 58
mean_height <- mean(height)
mean_height
[1] 158.5
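As a quick check of the mean formula, the sketch below computes the arithmetic mean of `weight` by hand and compares it with `mean()`. The names `n` and `manual_mean` are illustrative and not part of the original text.

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)

# Arithmetic mean from the formula: sum of the values divided by n
n <- length(weight)
manual_mean <- sum(weight) / n

manual_mean                            # 58
all.equal(manual_mean, mean(weight))   # TRUE
```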
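Variance measures the amount of variation in our data, that is, how far the values are scattered from the mean. Note that R's var() divides by n - 1 (the sample variance) rather than by n as in the formula above, so its result is slightly larger than the value that formula gives.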
variance_weight <- var(weight)
variance_weight
[1] 171.3333
variance_height <- var(height)
variance_height
[1] 87.83333
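The variance formula given earlier divides by n, whereas `var()` divides by n - 1. The following sketch computes both versions for `weight` so the difference is visible; `pop_var` and `sample_var` are illustrative names introduced here.

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
n <- length(weight)

# Squared deviations from the mean
squared_dev <- (weight - mean(weight))^2

# Divisor n: the formula as written in this chapter
pop_var <- sum(squared_dev) / n

# Divisor n - 1: the sample variance, which is what var() returns
sample_var <- sum(squared_dev) / (n - 1)

pop_var                                # 154.2
sample_var                             # 171.3333
all.equal(sample_var, var(weight))     # TRUE
```

The two values are related by a factor of (n - 1) / n, so for large samples the difference becomes negligible.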
Standard deviation is simply the square root of the variance, so it is expressed in the same units as the data.
sd_weight <- sd(weight)
sd_weight
[1] 13.08944
sd_height <- sd(height)
sd_height
[1] 9.371944
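The relationship between standard deviation and variance can be confirmed directly. This is a minimal sketch; `sd_from_var` is an illustrative name.

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height <- c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)

# Square root of the (n - 1) variance reproduces sd()
sd_from_var <- sqrt(c(weight = var(weight), height = var(height)))

sd_from_var
all.equal(unname(sd_from_var), c(sd(weight), sd(height)))   # TRUE
```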
Median is the middle value of the data when arranged in ascending or descending order. With an even number of observations, as here, it is the average of the two middle values.
median_weight <- median(weight)
median_weight
[1] 56.5
median_height <- median(height)
median_height
[1] 160.5
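To see where 56.5 comes from, the sketch below sorts `weight` and picks the middle value by hand, averaging the two central values when the number of observations is even. The variable names are illustrative.

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)

sorted <- sort(weight)   # 44 45 47 47 55 58 61 66 76 81
n <- length(sorted)

if (n %% 2 == 0) {
  # Even n: average the two middle values
  manual_median <- (sorted[n / 2] + sorted[n / 2 + 1]) / 2
} else {
  # Odd n: take the single middle value
  manual_median <- sorted[(n + 1) / 2]
}

manual_median                              # 56.5
all.equal(manual_median, median(weight))   # TRUE
```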
In R, range() returns the minimum and maximum values of the data; the range as defined in the formula above is the difference between these two values.
range_weight <- range(weight)
range_weight
[1] 44 81
range_height <- range(height)
range_height
[1] 140 170
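Because `range()` returns the two extremes rather than their difference, one way to obtain the range as a single number is to apply `diff()` to its result. A short sketch, with illustrative variable names:

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height <- c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)

# Maximum minus minimum, written out explicitly
spread_weight <- max(weight) - min(weight)

# Equivalent: difference of the two values returned by range()
spread_height <- diff(range(height))

spread_weight   # 37
spread_height   # 30
```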
Covariance measures how two sets of data vary together. In our example, we calculate how weight and height vary together. As with var(), R's cov() divides by n - 1 rather than by n.
cov_weight_height <- cov(weight, height)
cov_weight_height
[1] 83.44444
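The covariance formula given earlier uses the divisor n, so its result differs from `cov()` by a factor of n / (n - 1). The sketch below evaluates the formula as written and then rescales it; `cov_n` and `cov_n1` are illustrative names.

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height <- c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)
n <- length(weight)

# Formula from this chapter: mean of the products minus product of the means
cov_n <- sum(weight * height) / n - mean(weight) * mean(height)

# Rescale to the n - 1 divisor used by cov()
cov_n1 <- cov_n * n / (n - 1)

cov_n                                      # 75.1
cov_n1                                     # 83.44444
all.equal(cov_n1, cov(weight, height))     # TRUE
```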
Correlation measures the strength and direction of the association between two variables. In this example, correlation tells us what kind of relationship weight and height possess.
correlation <- cor(weight, height)
correlation
[1] 0.680216
If the value of correlation is greater than 0, we say that the two variables are positively correlated.
If the value of correlation is less than 0, we say that the two variables are negatively correlated.
If the value of correlation is 0, we say that the two variables are uncorrelated, meaning there is no linear relationship between them.
In our example the correlation is 0.680216, which is greater than 0, and hence we say that weight and height are positively correlated.
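The correlation formula and the sign rules above can be sketched together as follows. Since the same divisor appears in the covariance and in both variances, it cancels, so the n - 1 versions returned by `cov()` and `sd()` give the same r as the formula with n. The helper `describe_correlation()` is hypothetical and written only for this illustration.

```r
# Same data as in the text
weight <- c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height <- c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)

# Correlation as covariance divided by the product of standard deviations
r_manual <- cov(weight, height) / (sd(weight) * sd(height))
all.equal(r_manual, cor(weight, height))   # TRUE

# Hypothetical helper: turn the sign of r into the wording used in the text
describe_correlation <- function(r) {
  if (r > 0) {
    "positively correlated"
  } else if (r < 0) {
    "negatively correlated"
  } else {
    "not correlated"
  }
}

describe_correlation(r_manual)   # "positively correlated"
```

The sign only gives the direction of the relationship; the closer the absolute value of r is to 1, the stronger the linear association.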