The Unique Function in R

The unique() function in R removes duplicate values from a vector and duplicate rows from a data frame or matrix. It returns a new object and leaves the original data unchanged.

The unique() function is important in EDA (Exploratory Data Analysis) because it lets you identify and drop duplicate values in a single call.

Understanding the Application of the Unique() Function

In this article, we are going to explore the various applications of the unique() function in R programming. Let’s roll!

The Idea of Getting Unique Values

Before going into the topic, it’s good to know the idea behind it, which in this case is unique values. The unique() function returns each distinct value once by dropping the repeated occurrences.

[Figure: Unique Function in R]

The diagram shows that the unique() function looks for duplicates and eliminates them to return the unique values. The following sections walk through several illustrations of this.

The Syntax of the Unique() Function in R

unique(x)

unique: The unique() function is used to identify and eliminate the duplicate values present in the data.

Where:

x = It can be a vector, a data frame, or a matrix.
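
As a minimal sketch of the call: unique() returns a new object rather than changing its input. The default method in base R also accepts a fromLast argument, which keeps the last occurrence of each value instead of the first. The vector x and the variable names below are illustrative and do not come from the article's data.

```r
# A small vector with repeated values (illustrative data)
x <- c(3, 1, 3, 2, 1)

# Default behaviour: the first occurrence of each value is kept
first_kept <- unique(x)

# fromLast = TRUE keeps the last occurrence of each value instead
last_kept <- unique(x, fromLast = TRUE)

print(first_kept)  # 3 1 2
print(last_kept)   # 3 2 1
print(x)           # x itself is unchanged: 3 1 3 2 1
```

To keep the de-duplicated result, assign it to a variable, as done here with first_kept.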

A Simple Example of Unique() Function in R

If you have a vector that has duplicate values, then with the help of the unique() function you can easily eliminate those using a single line of code.

Let’s see how it works…

#An input vector having duplicate values
df <- c(1, 2, 3, 2, 4, 5, 1, 6, 8, 9, 8, 6)

#eliminates the duplicate values in the vector
unique(df)
Output = 1 2 3 4 5 6 8 9

In the above illustration, you can see that the input vector has many duplicate values.

When we pass that vector to the unique() function, it drops every repeated occurrence and returns each value once, in the order of first appearance, as shown above.
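
For a vector, the same result can be sketched with base R's duplicated() function, which flags every element that has already appeared earlier. This is only an illustration of how the two functions relate. The variable names are made up for this example.

```r
# Same input vector as in the example above
values <- c(1, 2, 3, 2, 4, 5, 1, 6, 8, 9, 8, 6)

# duplicated() returns TRUE for every element that repeats an earlier one
is_repeat <- duplicated(values)

# Keeping only the non-flagged elements matches what unique() returns
kept <- values[!is_repeat]

# The values that occur more than once in the input
repeated_values <- unique(values[is_repeat])

print(kept)             # 1 2 3 4 5 6 8 9
print(repeated_values)  # 2 1 8 6
```

duplicated() is handy when you want to know which values were repeated, not only what remains after removing them.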

Finding the Unique Values in a Matrix

Now, we are going to find the duplicate rows present in a matrix and eliminate them using the unique() function. For a matrix, unique() compares whole rows, not individual elements.

For this, we first have to create a matrix that contains a duplicated row.

To create the matrix, run the code below.

#creates a 6 x 4 matrix having 24 elements
df <- matrix(rep(1:20, length.out = 24), nrow = 6, ncol = 4, byrow = TRUE)

#prints the matrix
df
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
[5,]   17   18   19   20
[6,]    1    2    3    4

As you can see, the last row is an exact duplicate of the first row. All you need to do is call the unique() function to eliminate it.

#removes the duplicate rows
unique(df)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
[5,]   17   18   19   20

YaY!

You did it! The unique() function removed the duplicated row and returned a matrix containing only the unique rows.
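
The row-wise behaviour can be sketched as follows. The matrix method of unique() in base R takes a MARGIN argument (1 for rows, which is the default, and 2 for columns). To get the distinct elements regardless of position, one option is to flatten the matrix with as.vector() first. The variable names are illustrative.

```r
# Same 6 x 4 matrix as above: the sixth row repeats the first
m <- matrix(rep(1:20, length.out = 24), nrow = 6, ncol = 4, byrow = TRUE)

# Default (MARGIN = 1): duplicated rows are dropped
unique_rows <- unique(m)

# MARGIN = 2: duplicated columns are dropped (this matrix has none)
unique_cols <- unique(m, MARGIN = 2)

# Distinct elements irrespective of their position in the matrix
unique_elements <- unique(as.vector(m))

print(dim(unique_rows))         # 5 4
print(dim(unique_cols))         # 6 4
print(length(unique_elements))  # 20
```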

Finding the Unique Values in the Dataframe

So far, we have worked with vectors and matrices to extract the unique values by eliminating the duplicates.

In this section, let’s focus on getting the unique rows of a data frame.

To create a data frame, run the code below.

#creates a data frame
Class_data <- data.frame(
  Student = c('Naman', 'Megh', 'Mark', 'Naman', 'Megh', 'Mark'),
  Age = c(22, 23, 24, 22, 23, 24),
  Gender = c('Male', 'Female', 'Male', 'Male', 'Female', 'Male')
)

#dataframe
Class_data
   Student Age Gender
1   Naman  22   Male
2    Megh  23  Female
3    Mark  24   Male
4   Naman  22   Male
5    Megh  23  Female
6    Mark  24   Male

This data frame contains duplicate rows: rows 4 to 6 repeat rows 1 to 3. Let’s apply the unique() function to get rid of them.

unique(Class_data)
   Student Age  Gender
1   Naman  22   Male
2    Megh  23  Female
3    Mark  24   Male

Wow! The unique() function returned the unique rows of the data frame by eliminating the duplicated ones.

Just like this, by using the unique() function in R, you can easily get the unique values present in the data.
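
As a hedged sketch of two related operations on the same data frame: duplicated() can count how many rows repeat an earlier row, and unique() can be applied to a subset of columns to get the distinct combinations of just those columns. The variable names n_duplicate_rows, deduped, and gender_only are illustrative.

```r
# Same data frame as in the example above
Class_data <- data.frame(
  Student = c("Naman", "Megh", "Mark", "Naman", "Megh", "Mark"),
  Age     = c(22, 23, 24, 22, 23, 24),
  Gender  = c("Male", "Female", "Male", "Male", "Female", "Male")
)

# Number of rows that repeat an earlier row
n_duplicate_rows <- sum(duplicated(Class_data))

# Full-row de-duplication, stored for later use
deduped <- unique(Class_data)

# Distinct values of a single column, kept as a data frame (drop = FALSE)
gender_only <- unique(Class_data[, "Gender", drop = FALSE])

print(n_duplicate_rows)   # 3
print(nrow(deduped))      # 3
print(nrow(gender_only))  # 2
```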

Finding the Unique Values of a Particular Column

What if you need the unique values of a specific column instead of the whole data set?

Worry not. Using the unique() function, we can also get the unique values of a particular column, as shown below.

#creates a data frame
Class_data <- data.frame(
  Student = c('Naman', 'Megh', 'Mark', 'Naman', 'Megh', 'Mark'),
  Age = c(22, 23, 24, 22, 23, 24),
  Gender = c('Male', 'Female', 'Male', 'Male', 'Female', 'Male')
)

#dataframe
Class_data
   Student Age Gender
1   Naman  22   Male
2    Megh  23  Female
3    Mark  24   Male
4   Naman  22   Male
5    Megh  23  Female
6    Mark  24   Male

Okay, I am using the same data frame as in the previous section to keep things easy to follow.

Let’s use the unique() function on the Student column to get rid of its duplicate values.

unique(Class_data$Student)
Output = "Naman" "Megh"  "Mark" 

In the same way, we can get the unique values in the Age or Gender columns.

unique(Class_data$Gender)
Output = "Male"   "Female"

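The column-wise idea can be extended to every column at once. The sketch below uses base R's sapply() to count the distinct values per column and table() to count how often each value occurs in one column. The variable names distinct_per_column and gender_counts are illustrative.

```r
# Same data frame as in the example above
Class_data <- data.frame(
  Student = c("Naman", "Megh", "Mark", "Naman", "Megh", "Mark"),
  Age     = c(22, 23, 24, 22, 23, 24),
  Gender  = c("Male", "Female", "Male", "Male", "Female", "Male")
)

# Number of distinct values in each column
distinct_per_column <- sapply(Class_data, function(column) length(unique(column)))

# How many times each distinct value appears in the Gender column
gender_counts <- table(Class_data$Gender)

print(distinct_per_column)  # Student 3, Age 3, Gender 2
print(gender_counts)        # Female 2, Male 4
```
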
Finding the Length of the Unique Values

In this section, we are going to get the count of the unique values in the data. This count helps you understand your data better and get it ready for further analysis.

#printing the built-in BOD dataset
datasets::BOD
    Time  demand
1    1     8.3
2    2    10.3
3    3    19.0
4    4    16.0
5    5    15.6
6    7    19.8

Well, we are using the BOD dataset here. Let’s find the unique values first, followed by their count.

#returns the unique value
unique(BOD$demand)
Output = 8.3 10.3 19.0 16.0 15.6 19.8

Okay, now we have the unique values present in the demand column in the BOD dataset.

Now we can find the count of the unique values.

#returns the length of unique values
length(unique(BOD$demand))
Output = 6
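
The count is 6, the same as the number of rows in BOD, which means the demand column has no repeated values. A small sketch of that check follows, using base R's anyDuplicated(), which returns 0 when nothing repeats. The readings vector is hypothetical and is only there for contrast.

```r
# BOD ships with base R's datasets package
n_distinct_demand <- length(unique(BOD$demand))

# TRUE when every value in the column is distinct
all_distinct <- n_distinct_demand == nrow(BOD)

# anyDuplicated() returns 0 when there is no repeated value
first_dup_index <- anyDuplicated(BOD$demand)

# Hypothetical vector that does contain a repeat
readings <- c(8.3, 10.3, 8.3, 16.0)
n_distinct_readings <- length(unique(readings))

print(all_distinct)         # TRUE
print(first_dup_index)      # 0
print(n_distinct_readings)  # 3
```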

Wrapping Up

The unique() function in R is a valuable tool when it comes to EDA (Exploratory Data Analysis).

It helps you understand your data better by showing which distinct values it contains and how many of them there are.

This article covered several applications and use cases of the unique() function. Happy analyzing!
