R Data Frames


Data Frames


A data frame stores data in the form of a table, with rows and named columns.

A data frame can hold different types of data. The first column can be character while the second and third are numeric or logical. However, all the values within a single column must have the same type.
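As a minimal sketch of the point above, the snippet below builds a small data frame and reports the type of each column with sapply() and class(). The names Example_Frame and Column_Types are hypothetical and used only for illustration.

```r
# Hypothetical data frame for illustration: one type per column
Example_Frame <- data.frame(
  Name = c("Binu", "Aditi"),     # character column
  Mark = c(60, 70),              # numeric column
  Passed = c(TRUE, TRUE),        # logical column
  stringsAsFactors = FALSE       # the default since R 4.0; stated for older versions
)

# Report the class of each column
Column_Types <- sapply(Example_Frame, class)
Column_Types
```

Each entry of Column_Types names a single class, because a column cannot mix types.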

Use the data.frame() function to create a data frame:
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Print the data frame
Data_Frame
Output:
   Name RollNo Mark
1  Binu    101   60
2 Aditi    102   70
3  Abhi    103   45


Check if a variable is a data frame or not


We can check whether a variable is a data frame using the class() function.
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Print the class of data frame
class(Data_Frame)
Output:
[1] "data.frame"
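As a complementary sketch, base R also provides is.data.frame(), which answers the question directly with TRUE or FALSE. Sample_Frame and Sample_Vector are hypothetical names used only for illustration.

```r
# Hypothetical objects for illustration
Sample_Frame <- data.frame(Name = c("Binu", "Aditi"), Mark = c(60, 70))
Sample_Vector <- c(60, 70)

is.data.frame(Sample_Frame)    # TRUE
is.data.frame(Sample_Vector)   # FALSE
```

This form is convenient inside an if() condition, where a single logical value is needed.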

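Structure of the Data Frame

Use the str() function to display the structure of a data frame, including the type of each column: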
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Print the structure of data frame
str(Data_Frame)
Output:
'data.frame': 3 obs. of  3 variables:
 $ Name  : chr  "Binu" "Aditi" "Abhi"
 $ RollNo: num  101 102 103
 $ Mark  : num  60 70 45

Summarize the Data

Use the summary() function to summarize the data from a Data Frame:
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Print the data frame and its summary
Data_Frame
summary(Data_Frame)
Output:
   Name RollNo Mark
1  Binu    101   60
2 Aditi    102   70
3  Abhi    103   45
     Name               RollNo           Mark      
 Length:3           Min.   :101.0   Min.   :45.00  
 Class :character   1st Qu.:101.5   1st Qu.:52.50  
 Mode  :character   Median :102.0   Median :60.00  
                    Mean   :102.0   Mean   :58.33  
                    3rd Qu.:102.5   3rd Qu.:65.00  
                    Max.   :103.0   Max.   :70.00  

Note: You will learn more about the summary() function in the statistical part of the R tutorial.

Access Items

We can use single brackets [ ], double brackets [[ ]] or $ to access columns from a data frame. Single brackets return a data frame, while double brackets and $ return the column as a vector:
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

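# Print the data frame
Data_Frame

# Access the first column with single brackets
Data_Frame[1]

# Access the second column with double brackets
Data_Frame[[2]]

# Access the Mark column with $
Data_Frame$Mark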

Output:
   Name RollNo Mark
1  Binu    101   60
2 Aditi    102   70
3  Abhi    103   45
   Name
1  Binu
2 Aditi
3  Abhi
[1] 101 102 103
[1] 60 70 45
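The output above hints at a difference between the three forms that a short sketch can make explicit: single brackets keep the table shape, while double brackets and $ hand back the column itself. The calls below use only base R; selecting by column name with [["RollNo"]] is shown as an extra illustration.

```r
Data_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi"),
  RollNo = c(101, 102, 103),
  Mark = c(60, 70, 45)
)

# Single brackets keep the data frame structure
class(Data_Frame[1])       # "data.frame"

# Double brackets and $ extract the column itself as a vector
class(Data_Frame[[2]])     # "numeric"
class(Data_Frame$Mark)     # "numeric"

# Columns can also be selected by name instead of position
Data_Frame[["RollNo"]]     # 101 102 103
```

Selecting by name tends to be safer than selecting by position, since it still works if the column order changes.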

Add Rows

Use the rbind() function to add new rows in a Data Frame:
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Add a new row as a one-row data frame (keeps RollNo and Mark numeric) and print the data frame
New_Data_Frame <- rbind(Data_Frame, data.frame(Name = "Padma", RollNo = 104, Mark = 75))
New_Data_Frame

Output:
   Name RollNo Mark
1  Binu    101   60
2 Aditi    102   70
3  Abhi    103   45
4 Padma    104   75
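One caveat when adding rows: the printed table looks the same whether the new row is supplied as a plain c() vector or as a one-row data frame, but the column types differ. The sketch below compares the two. Coerced_Frame and Typed_Frame are hypothetical names used only for illustration.

```r
Data_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi"),
  RollNo = c(101, 102, 103),
  Mark = c(60, 70, 45)
)

# c() builds a single character vector, so the numeric columns become character
Coerced_Frame <- rbind(Data_Frame, c("Padma", 104, 75))
class(Coerced_Frame$Mark)    # "character"

# A one-row data frame keeps each column's original type
Typed_Frame <- rbind(
  Data_Frame,
  data.frame(Name = "Padma", RollNo = 104, Mark = 75)
)
class(Typed_Frame$Mark)      # "numeric"
mean(Typed_Frame$Mark)       # 62.5
```

Keeping the columns numeric matters as soon as the data is used for arithmetic or passed to summary().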

Add Columns

Use the cbind() function to add new columns in a Data Frame:

# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Add a new column and print the data frame
New_Data_Frame <- cbind(Data_Frame, Grade = c("C", "B", "D"))
New_Data_Frame

Output:
   Name RollNo Mark Grade
1  Binu    101   60     C
2 Aditi    102   70     B
3  Abhi    103   45     D
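As an alternative sketch, a column can also be added by assigning to a new name with $, which modifies the data frame in place instead of building a new one.

```r
Data_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi"),
  RollNo = c(101, 102, 103),
  Mark = c(60, 70, 45)
)

# Assigning to a name that does not exist yet creates the column
Data_Frame$Grade <- c("C", "B", "D")

names(Data_Frame)   # "Name" "RollNo" "Mark" "Grade"
```

The new vector must have one value per row, here three.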

Remove Rows and Columns

Use negative indices inside single brackets to remove rows and columns from a Data Frame. The c() function lists the positions to remove:
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Remove the first row and the first column and print the data frame
New_Data_Frame <- Data_Frame[-c(1), -c(1)]
New_Data_Frame
Output:
  RollNo Mark
2    102   70
3    103   45
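Two variations on the same idea, given as a rough sketch: dropping a column by name rather than by position, and dropping several rows at once. Without_Name and Only_Second are hypothetical names used only for illustration.

```r
Data_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi"),
  RollNo = c(101, 102, 103),
  Mark = c(60, 70, 45)
)

# Keep every column whose name is not "Name"
Without_Name <- Data_Frame[, names(Data_Frame) != "Name"]
names(Without_Name)    # "RollNo" "Mark"

# Remove rows 1 and 3 in one step; the empty column position keeps all columns
Only_Second <- Data_Frame[-c(1, 3), ]
nrow(Only_Second)      # 1
```

In both cases the original Data_Frame is left unchanged; the result is stored in a new variable.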


Number of Rows and Columns

Use the dim() function to find the number of rows and columns in a Data Frame. It returns the number of rows first, then the number of columns.
You can also use the ncol() function to find the number of columns and nrow() to find the number of rows.

# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Print the data frame dimension
dim(Data_Frame)
ncol(Data_Frame)
nrow(Data_Frame)

Output:
[1] 3 3
[1] 3
[1] 3
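Because the example above has three rows and three columns, its output does not show which number is which. The sketch below uses a hypothetical data frame, Tall_Frame, with four rows and two columns to make the order visible.

```r
# Hypothetical data frame for illustration: 4 rows, 2 columns
Tall_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi", "Manu"),
  Mark = c(60, 70, 45, 55)
)

dim(Tall_Frame)     # 4 2  (rows first, then columns)
nrow(Tall_Frame)    # 4
ncol(Tall_Frame)    # 2
```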

Data Frame Length

Use the length() function to find the number of columns in a Data Frame (similar to ncol()):
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)

# Print the data frame length
length(Data_Frame)
Output:
[1] 3

Combining Data Frames

Use the rbind() function to combine two or more data frames in R vertically:
Data_Frame1 <- data.frame (
   Name = c("Binu", "Aditi", "Abhi"),
   RollNo = c(101, 102,103),
   Mark = c(60, 70, 45)
)
Data_Frame2 <- data.frame (
   Name = c("Sus", "Kichu", "Ponnu"),
   RollNo = c(104, 105,106),
   Mark = c(50, 80, 75)
)
New_Data_Frame <- rbind(Data_Frame1, Data_Frame2)
New_Data_Frame


Output:
   Name RollNo Mark
1  Binu    101   60
2 Aditi    102   70
3  Abhi    103   45
4   Sus    104   50
5 Kichu    105   80
6 Ponnu    106   75

Use the cbind() function to combine two or more data frames in R horizontally. The data frames must have the same number of rows:
Data_Frame3 <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

Data_Frame4 <- data.frame (
  Steps = c(3000, 6000, 2000),
  Calories = c(300, 400, 300)
)

New_Data_Frame1 <- cbind(Data_Frame3, Data_Frame4)
New_Data_Frame1

Output:
  Training Pulse Duration Steps Calories
1 Strength   100       60  3000      300
2  Stamina   150       30  6000      400
3    Other   120       45  2000      300

Select top rows

Use the head() function to select the first n rows of a Data Frame:
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi","Manu"),
   RollNo = c(101, 102,103,104),
   Mark = c(60, 70, 45,55)
)

# Print the top two rows of data frame
head(Data_Frame,n=2)

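Print second and third row

Give a range of row numbers inside single brackets and leave the column position empty to select whole rows: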
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi","Manu"),
   RollNo = c(101, 102,103,104),
   Mark = c(60, 70, 45,55)
)

# Print the second and third rows of the data frame
Data_Frame[2:3,]

Select specific row and column
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi","Manu"),
   RollNo = c(101, 102,103,104),
   Mark = c(60, 70, 45,55)
)

# Print the second and third rows and the first and second columns of the data frame
Data_Frame[2:3,1:2]

Output:
   Name RollNo
2 Aditi    102
3  Abhi    103
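The same bracket syntax also accepts column names and logical conditions in place of positions. The sketch below shows both on the four-row data frame. By_Name and Above_50 are hypothetical names, and the threshold of 50 is an arbitrary value chosen for illustration.

```r
Data_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi", "Manu"),
  RollNo = c(101, 102, 103, 104),
  Mark = c(60, 70, 45, 55)
)

# Select the second and third rows, choosing columns by name
By_Name <- Data_Frame[2:3, c("Name", "Mark")]
By_Name

# Select every row whose Mark is greater than 50
Above_50 <- Data_Frame[Data_Frame$Mark > 50, ]
nrow(Above_50)    # 3
```

The condition Data_Frame$Mark > 50 produces one TRUE or FALSE per row, and only the TRUE rows are kept.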


How to modify a Data Frame in R?

Data frames can be modified through reassignment, in the same way as matrices: select a cell by its row and column, then assign a new value to it.
# Create a data frame
Data_Frame <- data.frame (
   Name = c("Binu", "Aditi", "Abhi","Manu"),
   RollNo = c(101, 102,103,104),
   Mark = c(60, 70, 45,55)
)
# Modify the Name column of the second row
Data_Frame[2, "Name"] <- "Padma"
# Modify the second column of the second row
Data_Frame[2, 2] <- 110
Data_Frame

Output:
   Name RollNo Mark
1  Binu    101   60
2 Padma    110   70
3  Abhi    103   45
4  Manu    104   55
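Reassignment is not limited to single cells. As a rough sketch, the example below updates a whole column at once and then only the rows that meet a condition. The 5 added marks and the floor of 55 are arbitrary values chosen for illustration.

```r
Data_Frame <- data.frame(
  Name = c("Binu", "Aditi", "Abhi", "Manu"),
  RollNo = c(101, 102, 103, 104),
  Mark = c(60, 70, 45, 55)
)

# Add 5 to every value in the Mark column
Data_Frame$Mark <- Data_Frame$Mark + 5
Data_Frame$Mark    # 65 75 50 60

# Raise any Mark below 55 up to 55
Data_Frame[Data_Frame$Mark < 55, "Mark"] <- 55
Data_Frame$Mark    # 65 75 55 60
```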


