In this chapter we will learn how to carry out some basic statistical measurements in R: mean, variance, standard deviation, median, range, covariance, and correlation.
The formulas for these measurements are:
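\[ \text{Mean} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
\[ \text{Variance} = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^2 \]
\[ \text{Median} = l+\frac{\frac{N}{2}-f_c}{f}\times h \]
\[ \text{Range} = \text{Maximum value} - \text{Minimum value} \]
\[ \text{Covariance} = \frac{1}{n}\sum_{i=1}^{n}x_iy_i- \bar{x}\bar{y} \]
\[ \text{Correlation}\ (r) = \frac{\text{Cov}(x,y)}{\sqrt{\text{Variance}(x)}\sqrt{\text{Variance}(y)}} \]
The median formula applies to grouped data, where \(l\) is the lower boundary of the median class, \(N\) the total frequency, \(f_c\) the cumulative frequency of the class preceding the median class, \(f\) the frequency of the median class, and \(h\) the class width.
Let us consider the following data: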
weight=c(45,81,47,58,61,76,47,44,66,55)
height=c(140,165,145,160,166,170,156,161,159,163)
Once the data have been entered in R, we can perform various statistical calculations.
By mean we generally refer to the arithmetic mean, which gives the center of our data.
mean_weight=mean(weight)
mean_weight
[1] 58
mean_height=mean(height)
mean_height
[1] 158.5
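The mean formula can be checked by hand. The following is a minimal sketch that applies the sum-divided-by-count definition directly to the weight data; the name manual_mean is introduced here purely for illustration.

```r
weight = c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)

# Arithmetic mean from its definition: sum of the values divided by their count
manual_mean = sum(weight) / length(weight)
manual_mean

# Compare with the built-in function; all.equal() tolerates rounding error
all.equal(manual_mean, mean(weight))
```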
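Variance measures the amount of variation present in our data, that is, how far the observations are scattered from the center of the data. Note that var() in R divides by n - 1 (the sample variance) rather than by n as in the formula above, so its result is slightly larger than the formula's.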
variance_weight=var(weight)
variance_weight
[1] 171.3333
variance_height=var(height)
variance_height
[1] 87.83333
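The variance formula given at the start of the chapter divides by n, whereas var() in R divides by n - 1. The sketch below computes both versions for the weight data so the difference is visible; the names ss, var_n, and var_n_minus_1 are illustrative and not part of the original example.

```r
weight = c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
n = length(weight)

# Sum of squared deviations from the mean
ss = sum((weight - mean(weight))^2)

# Divisor n, as in the formula at the start of the chapter
var_n = ss / n
var_n

# Divisor n - 1, which is what var() uses
var_n_minus_1 = ss / (n - 1)
var_n_minus_1

all.equal(var_n_minus_1, var(weight))
```

For these ten observations the two versions differ by a factor of 10/9. The gap shrinks as the sample size grows.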
Standard deviation is simply the square root of the variance.
sd_weight=sd(weight)
sd_weight
[1] 13.08944
sd_height=sd(height)
sd_height
[1] 9.371944
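The relationship between standard deviation and variance can be confirmed in one line. This is a small sketch using the height data; sd_from_var is a name chosen here for illustration.

```r
height = c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)

# Square root of the variance reported by var()
sd_from_var = sqrt(var(height))
sd_from_var

# Should agree with sd() up to rounding error
all.equal(sd_from_var, sd(height))
```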
Median is the middle value of the data when the observations are arranged in ascending or descending order.
median_weight=median(weight)
median_weight
[1] 56.5
median_height=median(height)
median_height
[1] 160.5
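Because median() works on raw observations rather than on a grouped frequency table, it amounts to sorting the data and picking the middle. The following sketch spells that out for the weight data, averaging the two middle values when the number of observations is even; sorted_weight and manual_median are illustrative names.

```r
weight = c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)

# Arrange the observations in ascending order
sorted_weight = sort(weight)
n = length(sorted_weight)

if (n %% 2 == 0) {
  # Even count: average the two middle observations
  manual_median = (sorted_weight[n / 2] + sorted_weight[n / 2 + 1]) / 2
} else {
  # Odd count: take the single middle observation
  manual_median = sorted_weight[(n + 1) / 2]
}
manual_median

all.equal(manual_median, median(weight))
```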
In R, range() returns the minimum and maximum values of the data. The range as defined in the formula above is the difference between these two values.
range_weight=range(weight)
range_weight
[1] 44 81
range_height=range(height)
range_height
[1] 140 170
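Since range() returns two numbers rather than a single spread, one extra step is needed to obtain the maximum-minus-minimum value from the formula. A minimal sketch, with spread_weight and spread_height as illustrative names:

```r
weight = c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height = c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)

# diff() on the (min, max) pair gives max - min
spread_weight = diff(range(weight))
spread_weight

# The same quantity written out explicitly
spread_height = max(height) - min(height)
spread_height
```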
Covariance measures how two sets of data vary together. In our example, we calculate how weight and height vary together.
cov_weight_height=cov(weight, height)
cov_weight_height
[1] 83.44444
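As with var(), the cov() function divides by n - 1, while the covariance formula at the start of the chapter divides by n. The sketch below evaluates the formula as written and then rescales it to match cov(); cov_n is a name introduced here for illustration.

```r
weight = c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height = c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)
n = length(weight)

# Divisor n: mean of the products minus the product of the means
cov_n = sum(weight * height) / n - mean(weight) * mean(height)
cov_n

# Rescale to divisor n - 1 to compare with cov()
cov_n * n / (n - 1)
cov(weight, height)
```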
Correlation measures the strength and direction of the linear association between two variables. In this example, the correlation tells us what kind of relationship weight and height possess.
correlation=cor(weight, height)
correlation
[1] 0.680216
If the value of correlation is greater than 0, then we say that the two variables are positively correlated.
If the value of correlation is less than 0, then we say that the two variables are negatively correlated.
If the value of correlation is 0, then we say that the two variables are not correlated.
In our example the correlation is 0.680216, which is greater than 0, and hence we say that weight and height are positively correlated.
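The correlation formula from the start of the chapter can be reproduced from cov() and sd(). The sketch below does this for the weight and height data; r_manual is an illustrative name. Because the same divisor appears in the numerator and the denominator, the choice between n and n - 1 cancels out and does not affect the result.

```r
weight = c(45, 81, 47, 58, 61, 76, 47, 44, 66, 55)
height = c(140, 165, 145, 160, 166, 170, 156, 161, 159, 163)

# Covariance divided by the product of the two standard deviations
r_manual = cov(weight, height) / (sd(weight) * sd(height))
r_manual

# Should agree with cor() up to rounding error
all.equal(r_manual, cor(weight, height))
```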