R Bar Plots

Bar Charts


A bar chart uses rectangular bars to visualize data. Bar charts can be displayed horizontally or vertically. The height or length of each bar is proportional to the value it represents.

Use the barplot() function to draw a vertical bar chart:
Example:
temperatures <- c(22, 27, 26, 24, 23, 26, 28)

# bar plot of temperatures vector
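# barplot() draws the chart and returns the midpoints of the bars
result <- barplot(temperatures)

print(result)

To label the bars, pass a vector of names to names.arg:

Example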
# x-axis values
x <- c("A", "B", "C", "D")

# y-axis values
y <- c(2, 4, 6, 8)

barplot(y, names.arg = x)


Example Explained
  • The x variable holds the labels shown on the x-axis (A, B, C, D)
  • The y variable holds the values shown on the y-axis (2, 4, 6, 8)
  • The barplot() function draws a bar chart of the values in y
  • names.arg sets the label of each bar on the x-axis
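
As an alternative to names.arg, the labels can travel with the data. The sketch below assumes a named vector (the name `values` is only illustrative); when names.arg is omitted, barplot() takes the bar labels from the names of the vector.

```r
# Named vector: each element carries its own label
values <- c(A = 2, B = 4, C = 6, D = 8)

# No names.arg needed; the names of the vector label the bars
barplot(values)
```

This keeps labels and values together, so reordering or subsetting the vector cannot put a label under the wrong bar.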

Density / Bar Texture

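To change the bar texture, use the density parameter. It fills the bars with shading lines instead of a solid color, and its value is the number of lines per inch: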

Example
x <- c("A", "B", "C", "D")
y <- c(2, 4, 6, 8)

barplot(y, names.arg = x, density = 10)
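
The density value can also differ from bar to bar, and the angle parameter controls the slope of the shading lines. A minimal sketch, with the particular values and the names `shading` and `slopes` chosen only for illustration:

```r
x <- c("A", "B", "C", "D")
y <- c(2, 4, 6, 8)

# One density (lines per inch) and one angle (degrees) per bar
shading <- c(5, 10, 20, 40)
slopes <- c(0, 45, 90, 135)

mids <- barplot(y, names.arg = x, density = shading, angle = slopes)
```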



Bar Width

Use the width parameter to change the width of the bars:

Example
x <- c("A", "B", "C", "D")
y <- c(2, 4, 6, 8)

barplot(y, names.arg = x, width = c(1,2,3,4))
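
Bar widths are relative to one another, so they can encode a second variable. The sketch below assumes a hypothetical vector `group_size` and prints the midpoints returned by barplot() to show how wider bars shift the positions along the x-axis.

```r
x <- c("A", "B", "C", "D")
y <- c(2, 4, 6, 8)

# Hypothetical second variable, used only to set the bar widths
group_size <- c(10, 20, 30, 40)

mids <- barplot(y, names.arg = x, width = group_size)

# The midpoints are no longer evenly spaced because the bars differ in width
print(mids)
```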



Horizontal Bars

If you want the bars to be displayed horizontally instead of vertically, use horiz=TRUE:

Example
x <- c("A", "B", "C", "D")
y <- c(2, 4, 6, 8)

barplot(y, names.arg = x, horiz = TRUE, col = "green")
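
With horizontal bars, the category labels move to the y-axis, where axis labels are drawn parallel to the axis by default. A minimal sketch that keeps them upright with las = 1 (a standard graphical parameter) and labels the value axis; the axis title is an assumption for illustration:

```r
x <- c("A", "B", "C", "D")
y <- c(2, 4, 6, 8)

# horiz = TRUE swaps the axes: values run along x, categories along y
# las = 1 draws all axis labels horizontally
mids <- barplot(y, names.arg = x, horiz = TRUE, col = "green",
                las = 1, xlab = "Value")
```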



Provide Names for the Chart, the Axes, and Each Bar of a Bar Plot in R

We pass the names.arg parameter inside barplot() to provide a name for each bar. The main, xlab, and ylab parameters set the chart title and the axis labels.
For example,
temperatures <- c(22, 27, 26, 24, 23, 26, 28)
result <- barplot(temperatures, main = "Maximum Temperatures in a Week",
  xlab = "Day", ylab = "Degree Celsius",
  names.arg = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"), col = "red")

print(result)
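
The midpoints returned by barplot() are useful for placing extra annotations. The following sketch writes each temperature above its bar with text(); the variable names `days` and `mids` and the extra headroom of 4 units on the y-axis are assumptions made for this example.

```r
temperatures <- c(22, 27, 26, 24, 23, 26, 28)
days <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

# Extend the y-axis a little so the labels above the tallest bar fit
mids <- barplot(temperatures, names.arg = days, col = "red",
                ylim = c(0, max(temperatures) + 4))

# Write each value just above its bar (pos = 3 means "above")
text(x = mids, y = temperatures, labels = temperatures, pos = 3)
```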


Stacked Bar Plot in R

R allows us to create stacked bars by using a matrix as input values. For example,
# create a matrix
titanic_data <- matrix(c(122, 203, 167, 118, 528, 178, 673, 212),
  nrow = 2, ncol = 4)

result <- barplot(titanic_data,
  main = "Survival of Each Class",
  xlab = "Class",
  names.arg = c("1st", "2nd", "3rd", "Crew"),
  col = c("red", "green")
)

legend("topleft",
  c("Not survived", "Survived"),
  fill = c("red", "green")
)

print(result)



In the above example, we have created a matrix named titanic_data whose 1st row contains the number of people who did not survive and whose 2nd row contains the number of survivors. Each column corresponds to one class.

Here, we have passed titanic_data inside barplot() to create stacked bars.

We have also used the legend() function to add a legend to our bar chart. The "green" color represents "Survived" and the "red" color represents "Not survived".
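
The legend can also be produced by barplot() itself. A minimal sketch, assuming we attach row and column names to the matrix: with legend.text = TRUE the row names become the legend entries, the column names label the bars, and args.legend passes the position on to legend().

```r
titanic_data <- matrix(c(122, 203, 167, 118, 528, 178, 673, 212),
                       nrow = 2, ncol = 4,
                       dimnames = list(c("Not survived", "Survived"),
                                       c("1st", "2nd", "3rd", "Crew")))

# Column names label the bars; row names feed the legend
barplot(titanic_data,
        main = "Survival of Each Class",
        xlab = "Class",
        col = c("red", "green"),
        legend.text = TRUE,
        args.legend = list(x = "topleft"))

# The height of each stacked bar is the column total
print(colSums(titanic_data))
```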

Instead of stacking, we can draw the bars for the elements of each column side by side by specifying the parameter beside = TRUE, as shown below.
# create a matrix
titanic_data <- matrix(c(122, 203, 167, 118, 528, 178, 673, 212),
  nrow = 2, ncol = 4)

result <- barplot(titanic_data,
  main = "Survival of Each Class",
  xlab = "Class",
  names.arg = c("1st", "2nd", "3rd", "Crew"),
  col = c("red", "green"),
  beside = TRUE
)

legend("topleft",
  c("Not survived", "Survived"),
  fill = c("red", "green")
)

print(result)
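
With beside = TRUE, the value returned by barplot() is a matrix of midpoints with the same shape as the input, one entry per bar. The sketch below uses it to print each count above its own bar; the ylim value of 750 is an assumption chosen to leave room for the labels.

```r
titanic_data <- matrix(c(122, 203, 167, 118, 528, 178, 673, 212),
                       nrow = 2, ncol = 4)

# One midpoint per bar, arranged like the input matrix
mids <- barplot(titanic_data,
                names.arg = c("1st", "2nd", "3rd", "Crew"),
                col = c("red", "green"),
                beside = TRUE,
                ylim = c(0, 750))

# Print each count above its own bar
text(x = as.vector(mids), y = as.vector(titanic_data),
     labels = as.vector(titanic_data), pos = 3)
```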




Plotting Categorical Data

Sometimes we have to plot the count of each item as bar plots from categorical data. For example, here is a vector with the ages of 10 college freshmen.

age <- c(17,18,18,17,18,19,18,16,18,18)

Simply doing barplot(age) will not give us the required plot. It will plot 10 bars with height equal to the student's age. But we want to know the number of students in each age category.

This count can be quickly found using the table() function, as shown below.
> table(age)
age
16 17 18 19 
 1  2  6  1 

Plotting this table gives the required bar plot. Note that the density argument shades the bars with lines instead of filling them with a solid color.

Example:
age <- c(17, 18, 18, 17, 18, 19, 18, 16, 18, 18)
barplot(table(age), main = "Age Count of 10 Students", xlab = "Age", ylab = "Count",
  border = "red", col = "blue", density = 10)
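
The same table can be converted to proportions before plotting. A minimal sketch using prop.table(), with the variable names and titles chosen only for illustration:

```r
age <- c(17, 18, 18, 17, 18, 19, 18, 16, 18, 18)

# Counts per age, then each count divided by the total
age_counts <- table(age)
age_share <- prop.table(age_counts)

barplot(age_share, main = "Age Distribution of 10 Students",
        xlab = "Age", ylab = "Proportion", col = "blue")
```

Proportions make charts comparable across groups of different sizes, which raw counts do not.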

Sometimes the data is in the form of a contingency table. For example, let us take the built-in Titanic dataset. The R documentation describes it as follows:
"This data set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner 'Titanic', summarized according to economic status (class), sex, age and survival."
> Titanic
, , Age = Child, Survived = No

      Sex
Class  Male Female
  1st     0      0
  2nd     0      0
  3rd    35     17
  Crew    0      0

, , Age = Adult, Survived = No

      Sex
Class  Male Female
  1st   118      4
  2nd   154     13
  3rd   387     89
  Crew  670      3

, , Age = Child, Survived = Yes

      Sex
Class  Male Female
  1st     5      1
  2nd    11     13
  3rd    13     14
  Crew    0      0

, , Age = Adult, Survived = Yes

      Sex
Class  Male Female
  1st    57    140
  2nd    14     80
  3rd    75     76
  Crew  192     20

We can see that this data has 4 dimensions: class, sex, age and survival. Suppose we want a bar plot of the number of males and females.
In this case we can use the margin.table() function. It sums the table entries over all dimensions except the one given by the index, where 1 is Class, 2 is Sex, 3 is Age and 4 is Survived.

> margin.table(Titanic, 1) # count according to class
Class
 1st  2nd  3rd Crew 
 325  285  706  885 
> margin.table(Titanic, 4) # count according to survival
Survived
  No  Yes 
1490  711 
> margin.table(Titanic) # gives total count if index is not provided
[1] 2201

Now that we have our data in the required format, we can plot it. For example, barplot(margin.table(Titanic,4)) plots the survival counts, and barplot(margin.table(Titanic,2)) plots the male vs female counts.
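
The male vs female plot can be sketched as follows. The variable name `sex_counts`, the titles and the colors are assumptions for illustration; the counts come from the built-in Titanic table.

```r
# Collapse the 4-dimensional table to its 2nd dimension (Sex)
sex_counts <- margin.table(Titanic, 2)
print(sex_counts)

barplot(sex_counts,
        main = "People on Board by Sex",
        xlab = "Sex", ylab = "Count",
        col = c("blue", "red"))
```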



Example
# count of people in each class (crew included)
titanic.data <- margin.table(Titanic, 1)
barplot(titanic.data,
  main = "Passengers in Each Class",
  xlab = "Class",
  ylab = "Passengers",
  col = "green"
)
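
margin.table() also accepts more than one index, which keeps several dimensions at once. The sketch below keeps Survived and Class, which should reproduce the counts typed by hand in the stacked bar example; the name `survival_by_class` is an assumption for illustration.

```r
# Keep two dimensions: Survived (4) as rows, Class (1) as columns
survival_by_class <- margin.table(Titanic, c(4, 1))
print(survival_by_class)

# A two-dimensional table is plotted as stacked bars, one bar per column
barplot(survival_by_class,
        main = "Survival of Each Class",
        xlab = "Class",
        col = c("red", "green"),
        legend.text = TRUE,
        args.legend = list(x = "topleft"))
```

Deriving the matrix from the dataset avoids transcription errors and keeps the row and column names attached to the counts.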


