read and write xlsx files in R
An xlsx is a file format used for Microsoft Excel spreadsheets. Excel can be used to store tabular data.
R has a built-in functionality that makes it easy to read and write a xlsx file.
Sample xlsx File
To demonstrate how we read xlsx files in R, let's suppose we have an excel file named stud.xlsx with following data:
Install and Load xlsx Package
In order to read, write, and format Excel files into R, we first need to install and load the xlsx package as:
# install xlsx package
install.package("xlsx")
# load xlsx file
library("xlsx")
Here, we have successfully installed and loaded the xlsx package.Now, we are able to read data from an xlsx file.
In R, we use the read.xlsx() function to read a xlsx file available in our current directory. For example,
# install xlsx package
install.package("xlsx")
# load xlsx file
library("xlsx")
# read stud.xlsx file from our current directory
read_data <- read.xlsx("stud.xlsx", sheetIndex = 1)
# display xlsx file
print(read_data)
In the above example, we have read the stud.xlsx file that is available in our current directory. Notice the code,read_data <- read.xlsx("stud.xlsx", sheetIndex = 1)
Here,
read.xlsx() - reads the xlsx file stud.xlsx and creates a dataframe which is stored in the read_data variable.
sheetIndex = 1 - reads specified worksheet i.e. 1Note:If the file is in some other location, we have to specify the path along with the file name as: read.xlsx("D:/folder1/studentinfo.xlsx", sheetIndex = 1).
We can also use the read.xlsx2() function if the dataset we are working on is larger.In R, we can also read a specific range of data from excel files. We can pass the rowIndex and colIndex argument inside read.xlsx() to read specific range.
rowIndex - reads a specific range of rows
colIndex - read a specific range of columns
colIndex - read a specific range of columns
Example: Read Range of Rows
install.package("xlsx")
# load xlsx file
library("xlsx")
# read first five rows of xlsx file
read_data <- read.xlsx("stud.xlsx", sheetIndex = 1, rowIndex = 1:5 )
# display xlsx file
print(read_data)
In the above example, we have passed rowIndex = 1:5 inside read.xlsx() so the function reads only the first five rows from the stud.xlsx file.
Example: Read Range of Columns
# install xlsx package
install.package("xlsx")
# load xlsx file
library("xlsx")
# read first three columns of xlsx file
read_data <- read.xlsx("stud.xlsx",
sheetIndex = 1,
colIndex = 1:3
)
# display xlsx file
print(read_data)
xlsx startRow Argument in R
Sometimes the excel file may contain headers at the beginning which we may not want to include
To start reading data from a specific row in excel worksheet, we pass the startRow argument inside read.xlsx().
# install xlsx package
install.package("xlsx")
# load xlsx file
library("xlsx")
# start reading from 3rd row
read_data <- read.xlsx("stud.xlsx",
sheetIndex = 1,
startRow = 3
)
# display xlsx file
print(read_data)
in the above example, we have used the startRow argument inside the read.xlsx() function to start reading from the specified row.
startRow = 3 means the first two rows are ignored and read.xlsx() starts reading data from the 3rd row.
In R, we use the write.xlsx() function to write into a xlsx file. We pass the data in the form of dataframe. For example,
# install xlsx package
install.package("xlsx")
# load xlsx file
library("xlsx")
# create a data frame
dataframe1 <- data.frame (
Name = c("Juan", "Alcaraz", "Simantha"),
Age = c(22, 15, 19),
Vote = c(TRUE, FALSE, TRUE))
# write dataframe1 into file1 xlsx file
write.xlsx(dataframe1, "file1.xlsx")
Rename Current Worksheet
We can rename the current worksheet by using the sheetName argument inside the write.xlsx() function. For example,
# install xlsx package install.package("xlsx")
# load xlsx file
library("xlsx")
# create a data frame
dataframe1 <- data.frame (
Name = c("Juan", "Alcaraz", "Simantha"),
Age = c(22, 15, 19),
Vote = c(TRUE, FALSE, TRUE))
# name current worksheet
write.xlsx(dataframe1, "file1.xlsx",
sheetName = "Voting Eligibility"
)
Comments
Post a Comment