read and write xlsx files in R


An xlsx is a file format used for Microsoft Excel spreadsheets. Excel can be used to store tabular data.

R has a built-in functionality that makes it easy to read and write a xlsx file.

Sample xlsx File

To demonstrate how we read xlsx files in R, let's suppose we have an excel file named stud.xlsx with following data:


Install and Load xlsx Package


In order to read, write, and format Excel files into R, we first need to install and load the xlsx package as:
# install xlsx package 
install.package("xlsx") 
 # load xlsx file 
library("xlsx")

Here, we have successfully installed and loaded the xlsx package.Now, we are able to read data from an xlsx file.

Read a xlsx File in R

In R, we use the read.xlsx() function to read a xlsx file available in our current directory. For example,

# install xlsx package 
 install.package("xlsx")
# load xlsx file 
library("xlsx") 
# read stud.xlsx file from our current directory 
read_data <- read.xlsx("stud.xlsx", sheetIndex = 1) 
# display xlsx file 
print(read_data)

In the above example, we have read the stud.xlsx file that is available in our current directory. Notice the code,read_data <- read.xlsx("stud.xlsx", sheetIndex = 1)


Here,
read.xlsx() - reads the xlsx file stud.xlsx and creates a dataframe which is stored in the read_data variable.
sheetIndex = 1 - reads specified worksheet i.e. 1

Note:If the file is in some other location, we have to specify the path along with the file name as: read.xlsx("D:/folder1/studentinfo.xlsx", sheetIndex = 1).
We can also use the read.xlsx2() function if the dataset we are working on is larger.

xlsx rowIndex and colIndex Argument in R

In R, we can also read a specific range of data from excel files. We can pass the rowIndex and colIndex argument inside read.xlsx() to read specific range.

rowIndex - reads a specific range of rows
colIndex - read a specific range of columns

Example: Read Range of Rows
# install xlsx package 
install.package("xlsx") 
# load xlsx file 
library("xlsx") 
# read first five rows of xlsx file 
read_data <- read.xlsx("stud.xlsx", sheetIndex = 1, rowIndex = 1:5 ) 
# display xlsx file 
print(read_data)

In the above example, we have passed rowIndex = 1:5 inside read.xlsx() so the function reads only the first five rows from the stud.xlsx file.


Example: Read Range of Columns
# install xlsx package 
 install.package("xlsx") 
 # load xlsx file 
library("xlsx") 
 # read first three columns of xlsx file 
read_data <- read.xlsx("stud.xlsx", sheetIndex = 1, colIndex = 1:3 )
 # display xlsx file 
print(read_data)

Here, colIndex = 1:3 inside read.xlsx() reads only the first three columns from the stud.xlsx file.

xlsx startRow Argument in R

Sometimes the excel file may contain headers at the beginning which we may not want to include

To start reading data from a specific row in excel worksheet, we pass the startRow argument inside read.xlsx().

Let's take a look at example,
# install xlsx package 
 install.package("xlsx") 
 # load xlsx file 
library("xlsx") 
 # start reading from 3rd row 
read_data <- read.xlsx("stud.xlsx", sheetIndex = 1, startRow = 3 ) 
 # display xlsx file 
print(read_data)

in the above example, we have used the startRow argument inside the read.xlsx() function to start reading from the specified row.

startRow = 3 means the first two rows are ignored and read.xlsx() starts reading data from the 3rd row.

Write Into xlsx File in R

In R, we use the write.xlsx() function to write into a xlsx file. We pass the data in the form of dataframe. For example,

# install xlsx package 
 install.package("xlsx") 
 # load xlsx file 
library("xlsx") 
 # create a data frame 
dataframe1 <- data.frame ( Name = c("Juan", "Alcaraz", "Simantha"), Age = c(22, 15, 19), Vote = c(TRUE, FALSE, TRUE)) 
 # write dataframe1 into file1 xlsx file 
write.xlsx(dataframe1, "file1.xlsx")


Rename Current Worksheet


We can rename the current worksheet by using the sheetName argument inside the write.xlsx() function. For example,
# install xlsx package 
 install.package("xlsx") 
 # load xlsx file library("xlsx") 
 # create a data frame 
dataframe1 <- data.frame ( Name = c("Juan", "Alcaraz", "Simantha"), Age = c(22, 15, 19), Vote = c(TRUE, FALSE, TRUE)) 
 # name current worksheet 
write.xlsx(dataframe1, "file1.xlsx", sheetName = "Voting Eligibility" )

Comments

Popular posts from this blog

Programming in R - Dr Binu V P

R Data Types

R- Linear Regression