Standard Deviation in R

As a statistical language, R provides the built-in function sd() to compute the standard deviation of a set of values.

What is the Standard Deviation?

Standard deviation is a measure of how widely a set of values is dispersed around its mean.
The higher the deviation, the wider the spread of values.
The lower the deviation, the narrower the spread of values.
Put simply, the standard deviation is the square root of the variance.

Importance of Deviation

  • Because the differences from the mean are squared before they are averaged, negative and positive differences do not cancel each other out.
  • Squaring gives larger deviations more weight, so unusually distant values stand out.
  • It shows how tightly the values cluster around the central tendency (the mean), which is very useful in analysis.
  • It plays a major role in finance, business, analysis, and measurement.

Before we get into the topic, keep this definition in mind!

Variance: the average of the squared differences between each observed value and the mean (the expected value). R's var() and sd() compute the sample versions, which divide by n - 1 rather than n.
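To make the two definitions concrete, here is a minimal sketch that computes the variance and the standard deviation by hand and compares them with R's built-in var() and sd(). The variable names are only illustrative, and the n - 1 denominator reflects the sample formulas that the built-ins use.

```r
x <- c(34, 56, 87, 65, 34, 56, 89)   # example values (illustrative)

n <- length(x)
deviations <- x - mean(x)            # differences from the mean; some are negative
squared <- deviations^2              # squaring makes every term non-negative

manual_var <- sum(squared) / (n - 1) # sample variance (n - 1 denominator)
manual_sd <- sqrt(manual_var)        # standard deviation = square root of the variance

all.equal(manual_var, var(x))        # TRUE
all.equal(manual_sd, sd(x))          # TRUE
```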

Find the Standard Deviation in R for Values in a Vector

In this method, we will create a vector x holding some values and then find the standard deviation of those values.

 
 x <- c(34,56,87,65,34,56,89)    #creates vector 'x' with some values in it.

 sd(x)  #calculates the standard deviation of the values in the vector 'x'


Output —> 22.28175

Now we can extract specific values from a vector y and find their standard deviation.

 
 y <- c(34,65,78,96,56,78,54,57,89)  #creates a vector 'y' with some values

 data1 <- y[1:5] #extracts the first five values by index

 sd(data1) #calculates the standard deviation of the extracted values


Output —> 23.28519
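Indexing is not limited to positions. As a small sketch (the threshold of 60 is an arbitrary choice for illustration), a logical condition or a negative index can also select the values to include:

```r
y <- c(34, 65, 78, 96, 56, 78, 54, 57, 89)

high <- y[y > 60]      # keep only values above an arbitrary threshold
high                   # 65 78 96 78 89
sd(high)               # standard deviation of the selected values

sd(y[-1])              # negative index: every value except the first
```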

Finding the Standard Deviation of Values Stored in a CSV File

In this method, we import a CSV file and find the standard deviation of the values stored in it.

 
readfile <- read.csv('testdata1.csv')  #reading a csv file

data2 <- readfile$Values      #getting values stored in the column 'Values'

sd(data2)                              #calculates the standard deviation  


Output —> 17.88624
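The file testdata1.csv is not included here, so the following sketch first writes a small CSV to a temporary file. The column name Values mirrors the example above, and the numbers are made up for illustration. It also shows the na.rm argument, since sd() returns NA when the column contains a missing value.

```r
# Write a small illustrative CSV to a temporary file (made-up data)
path <- tempfile(fileext = ".csv")
writeLines(c("Values", "12", "15", "NA", "20", "30"), path)

readfile <- read.csv(path)             # read the CSV into a data frame
data2 <- readfile$Values               # pull out the 'Values' column

sd(data2)                              # NA, because one value is missing
sd(data2, na.rm = TRUE)                # ignores the missing value

unlink(path)                           # remove the temporary file
```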

High and Low Deviation

In general, a low standard deviation means the values lie close to the average, while a high standard deviation means the values are spread far from the average.

We can illustrate this with an example.

 
x <- c(79,82,84,96,98)
mean(x)
--->  87.8
sd(x)
--->  8.613942


To plot these values as a bar graph in R, run the code below.

 
install.packages("ggplot2")

library(ggplot2)

values <- data.frame(marks=c(79,82,84,96,98), students=c(0,1,2,3,4))
head(values)                  #displays the values
p <- ggplot(values, aes(x=marks, y=students))+geom_bar(stat='identity')
p                             #displays the plot


Illustration for High Standard Deviation

 
y <- c(23,27,30,35,55,76,79,82,84,94,96)
mean(y)
---> 61.90909
sd(y)
---> 28.45507

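To compare the two examples side by side, a short sketch such as the following computes the mean and standard deviation of each vector in one step. The names low and high are only labels chosen here.

```r
low  <- c(79, 82, 84, 96, 98)
high <- c(23, 27, 30, 35, 55, 76, 79, 82, 84, 94, 96)

# Apply mean() and sd() to each vector and collect the results in a matrix
summary_stats <- sapply(list(low = low, high = high),
                        function(v) c(mean = mean(v), sd = sd(v)))
round(summary_stats, 2)

# The second vector is spread much more widely around its mean
sd(high) > sd(low)    # TRUE
```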

Conclusion

Finding the standard deviation of values in R is easy: R provides the sd() function for it. You can create a vector of values or import a CSV file and compute the deviation.

Important: You can also calculate it for a subset of the values by indexing into a vector or a column read from a file, as shown above.

Use the comment box to post any questions about the sd() function in R. Happy learning!!!
