
Plotting in R Part II - Grid Lines

We will start from where Part I ended.

Recap:

The libraries and data set needed. 
Library: ggplot2, stringr
Dataset: iris (to be loaded from local drive)
IDE: RStudio

# First load the libraries
library(ggplot2)
library(stringr) # provides str_wrap, used for the axis labels below

# Then set your working directory
setwd("C:/R_Train") # please replace with your own directory

# Read the file and create a data frame
tbl <- read.csv("Sample Data/iris.csv")


Get the csv file  from here.
data source 


Modify the grid lines

The grid lines can be modified individually using panel.grid.major.x, panel.grid.major.y, panel.grid.minor.x and panel.grid.minor.y.
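The four elements can also be set together in a single theme() call. The following is a minimal sketch, not the code from this post: it assumes the built-in iris data set (columns Species and Sepal.Length) in place of the CSV's Name and SepalLength, and the dashed linetype is only an illustration.

```r
library(ggplot2)

# Built-in iris data; Species / Sepal.Length stand in for the CSV's
# Name / SepalLength columns (an assumption to keep the sketch self-contained)
base_plot <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_bar(stat = "identity", width = 0.3)

# Style each of the four grid elements separately in one theme() call
grid_plot <- base_plot +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.major.y = element_line(colour = "black"),
    panel.grid.minor.x = element_blank(),
    panel.grid.minor.y = element_line(colour = "blue", linetype = "dashed")
  )

grid_plot
```

Storing the base plot in a variable means each variation only needs the extra theme() layer, instead of repeating the whole plot definition.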


  • Change the color of the major grid of the y axis.

Code
ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid.major.y = element_line(colour = "black"))

Output

  • Change the color of the minor grid of y axis.

Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid.major.y = element_line(colour = "black"), panel.grid.minor.y = element_line(colour = "blue"))

Output


  • Change the color of the grid of x axis
Code
ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid.major.y = element_line(colour = "black"), panel.grid.minor.y = element_line(colour = "blue")) +
  theme(panel.grid.major.x = element_line(colour = "black"))


Output




  • Change the color of all the major grid lines at once.

Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid.major = element_line(colour = "black"))

Output
  • Change all the grids at once

Code
ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid = element_line(colour = "black"))


Output


But in some cases you don't want the grid lines. 

  • remove x grid lines
Code
ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid.major.x = element_blank())


Output



  • Remove the y minor grid lines as well (the x grid lines stay removed)
Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid.major.x = element_blank(), panel.grid.minor.y = element_blank())

Output


  • remove all the grid lines
Code
ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) + 
  geom_bar(stat="identity", width = 0.3) +
  theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
  theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) + 
  theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
  scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50)) +
  scale_fill_manual("legend", values = c("Iris-setosa"  = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2")) +
  theme(panel.grid = element_blank())

Output



That is the end of part 2 of Plotting In R. 

If you have any specific request about configuring ggplot2, please leave a comment. I will try to add it to the current posts or cover it in future posts.


Plotting in R Part I - Axis




One of the strong points of R is plotting. It is possible to make highly presentable plots with the help of the ggplot2 library. There are many plots available in ggplot2. The aim of this tutorial series is more on configuring the different aspects of a plot, like background, grid, axis, title, border and so on. We will see how to make a simple plot and turn it into a nice-looking plot.

The libraries and data set needed.
Library: ggplot2, stringr
Dataset: iris (to be loaded from local drive)
IDE: RStudio

# First load the libraries
library(ggplot2)
library(stringr) # provides str_wrap, used for the axis labels below

# Then set your working directory
setwd("C:/R_Train") # please replace with your own directory

# Read the file and create a data frame
tbl <- read.csv("Sample Data/iris.csv")



Get the csv file  from here.
data source 

For this tutorial a bar chart is used to illustrate the modifications/enhancements you can make.

Base plot 

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity")


From the documentation “ Sometimes, bar charts are used not as a distributional summary, but instead of a dotplot. Generally, it's preferable to use a dotplot (see geom_point) as it has a better data-ink ratio. However, if you do want to create this type of plot, you can set y to the value you have calculated, and use stat='identity' ”

Output
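With stat="identity" and one row per flower, the bars stack every row of a category, so each bar's height is the sum of SepalLength for that category. The sketch below makes this explicit. It assumes the built-in iris data set (Species and Sepal.Length instead of the CSV's Name and SepalLength) and uses geom_col(), which is ggplot2's shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Built-in iris data stands in for the CSV (assumption: the same
# measurements, with columns named Species / Sepal.Length)
totals <- aggregate(Sepal.Length ~ Species, data = iris, FUN = sum)
print(totals)

# One row per species, so each bar height is exactly that species' total
p <- ggplot(totals, aes(x = Species, y = Sepal.Length)) +
  geom_col()
p
```

All three totals are below 350, which is why a y-axis limit of 350 fits this chart.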




The graph is basic and nothing fancy about it. Let’s configure or improve the appearance.

Let’s start with x-axis. First modify the Axis title.

  • Increase the Font size

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20))


Output


  • Change font color

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue"))


Output




  • Change font face

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic"))


Output



Modify the axis text.

The element modified here is axis.text.x
Note the difference between axis.title.x and axis.text.x
Change the font size, color and face

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold"))


Output



  • Change the angle
If your axis labels are long and overlap each other, it's possible to change the angle.

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold", angle = 90, hjust = 0.5))


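hjust = 0.5 centers the rotated text along its own length. If the labels do not line up with the bars, adding vjust = 0.5 usually centers them on the ticks.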

Output



  • Wrap the text

Another option is to wrap the text. For this we need the stringr library, which provides the function str_wrap to do the wrapping.

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10))


Output
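To see what str_wrap does on its own, here is a small sketch. The label is a hypothetical one with spaces, chosen so that there are obvious places to break.

```r
library(stringr)

# Hypothetical label with spaces, so str_wrap has places to break
label <- "Iris versicolor sample"
wrapped <- str_wrap(label, width = 10)

# cat() renders the inserted newline characters as real line breaks
cat(wrapped, "\n")
```

Labels such as "Iris-setosa" contain no spaces. Whether str_wrap breaks them at the hyphen may depend on the stringr version, so such labels can come back unchanged.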




All the actions explained above can also be applied to the y axis. In the example below only the y-axis title is modified, not the axis text.


Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
   theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic"))

Output



Modify the scale of y axis

It is possible to manually change the scale.

Code

ggplot(tbl, aes(x=Name, y=SepalLength)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
   theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
   scale_y_continuous(limits = c(0,350), breaks = seq(0,350,50))

Output





  • Modify the bars
    • Change the colors


By adding “fill = Name” in aes you can change the colors

Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10))

Output




The colors are pre-defined. If you want to choose the colors yourself, you need to do a manual override.

To do that you need to know all the categories in x.
This is easily done using the unique function.

unique(tbl$Name)
[1] "Iris-setosa" "Iris-versicolor" "Iris-virginica"


Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) +
   geom_bar(stat="identity") +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
   theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
   scale_fill_manual("legend", values = c("Iris-setosa" = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2"))

Output




Change the width of the bars. This is easily done using width = in geom_bar.

Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) +
   geom_bar(stat="identity", width = 0.3) +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
   theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
   scale_fill_manual("legend", values = c("Iris-setosa" = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2"))

Output




In the next post I will be discussing further configuration/modification/enhancement you can do with ggplot2.


Update I:
The 0 of the y axis does not sit on the x axis; there is a gap between them. Adding expand = c(0, 0) to scale_y_continuous removes the gap.

Code

ggplot(tbl, aes(x=Name, y=SepalLength, fill = Name)) +
   geom_bar(stat="identity", width = 0.3) +
   theme(axis.title.x = element_text(size = 20, color = "blue", face = "italic")) +
   theme(axis.text.x = element_text(size = 10, color = "firebrick3", face = "bold")) +
   scale_x_discrete(labels = function(x) str_wrap(x, width = 10)) +
   theme(axis.title.y = element_text(size = 20, color = "blue", face = "italic")) +
   scale_y_continuous(expand = c(0, 0), limits = c(0,350), breaks = seq(0,350,50)) +
   scale_fill_manual("legend", values = c("Iris-setosa" = "indianred3", "Iris-versicolor" = "lightcyan2", "Iris-virginica" = "darkolivegreen2"))

Output
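If you want to remove the gap only at the bottom and keep a little headroom above the tallest bar, the expansion() helper can set the two sides separately. This is a sketch under assumptions: it needs ggplot2 3.3.0 or later, it uses the built-in iris data set (Species and Sepal.Length) instead of the CSV, and the 5% headroom is an arbitrary choice.

```r
library(ggplot2)

# Built-in iris data stands in for the CSV used in this post (assumption)
p <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_bar(stat = "identity", width = 0.3) +
  scale_y_continuous(
    # no padding below 0, 5% padding above the tallest bar
    expand = expansion(mult = c(0, 0.05)),
    breaks = seq(0, 350, 50)
  )
p
```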






Plotting with R: geom_violin with color based on two column values (using GGPLOT2)

In this post I explain how to color code ggplot plots automatically when the number of data points varies. I did this when I had an issue with the lack of contrast between the default colors.

I am using the sample data as below



Load the data into R.
The idea is to have 'ID' and 'SUB ID' in X axis and 'Value'  in Y axis and to have different fill colors based on 'ID'.

1. One way to do this is to use 'facet_grid'.


I am calling the table 'tbl'.
ggplot(tbl, aes(x=SUB.ID, y=Value )) + geom_violin(aes(fill = ID)) + facet_grid(. ~ ID)

and the result is,






2. But if you want something like what you get in JMP, with both ID and Sub ID at the bottom of the plot, then there is a method (a long shot).

First create a new column by using 'paste'.

tbl$ID_SUB_ID <- paste(tbl$ID,tbl$SUB.ID, sep = "_")




Then plot using X=ID_SUB_ID

ggplot(tbl, aes(x=ID_SUB_ID, y=Value )) + geom_violin(aes(fill = ID))




Let's define the colors manually (based on ID).
This is a good option when you are doing this automatically and the number of IDs varies every time.


The idea is to alternate dark and light colors.
For example: brown, beige, darkolivegreen, khaki1, midnightblue, magenta, seagreen4, papayawhip


It is important to note that "Aesthetics must be either length 1 or the same as the data". In other words, a color needs to be defined for every row.


Let's create a data.frame with the color names.

com <- data.frame(c('brown', 'beige', 'darkolivegreen', 'khaki1', 'midnightblue','magenta', 'seagreen4', 'papayawhip'))
colnames(com)[1] <- 'color'



Let's add a serial number (index) to the table.
com$index <- seq.int(nrow(com))




As the color is based on ID the colors need to be matched against unique ids.

uID <- data.frame(unique(tbl$ID, incomparables = FALSE))
colnames(uID)[1] <- 'ID'

Let's add a serial number (index) to this table too.
uID$index <- seq.int(nrow(uID))



Merge the tables to have colors matched to IDs.

ID_color <- merge(uID,com, by = 'index', all.x = TRUE)



Now merge the data table and the color table by ID.
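The code for this merge is not shown in the post. A minimal sketch, assuming the merged table is the tblc used in the next plot and that ID_color holds the index, ID and color columns built above. The small tables here are hypothetical stand-ins so the snippet runs on its own.

```r
# Hypothetical stand-ins for the tables built in the steps above
tbl <- data.frame(
  ID     = c("A", "A", "B", "B"),
  SUB.ID = c("s1", "s2", "s1", "s2"),
  Value  = c(1.2, 0.8, 2.1, 1.7)
)
ID_color <- data.frame(
  index = 1:2,
  ID    = c("A", "B"),
  color = c("brown", "beige")
)

# Left join: every row of tbl gets the color assigned to its ID
tblc <- merge(tbl, ID_color, by = "ID", all.x = TRUE)
print(tblc)
```

Note that merge sorts the result by the key column, so the row order of tblc can differ from that of tbl.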



Now plot again..

ggplot(tblc, aes(x=ID_SUB_ID, y=Value)) + geom_violin(aes(fill = color))



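Let's remove the legend, as it is not what we want. Adding scale_fill_identity() makes ggplot2 use the color names stored in the color column as the actual fill colors; without it they are treated as category labels and the default palette is used.

ggplot(tblc, aes(x=ID_SUB_ID, y=Value)) + geom_violin(aes(fill = color)) + scale_fill_identity() + guides(fill = "none")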
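As an alternative to merging tables, the same ID-to-color assignment can be passed to scale_fill_manual as a named vector. This is a sketch of a different technique, not the method used in this post. The sample data is hypothetical (the post's own table is not reproduced here), and the names color_names and id_colors are introduced only for illustration.

```r
library(ggplot2)

# Hypothetical sample data with the same column names as in the post
set.seed(1)
tbl <- data.frame(
  ID     = rep(c("A", "B", "C"), each = 40),
  SUB.ID = rep(rep(c("s1", "s2"), each = 20), times = 3),
  Value  = rnorm(120)
)
tbl$ID_SUB_ID <- paste(tbl$ID, tbl$SUB.ID, sep = "_")

# Alternating dark and light colors, as listed in the post
color_names <- c("brown", "beige", "darkolivegreen", "khaki1",
                 "midnightblue", "magenta", "seagreen4", "papayawhip")

# One color per unique ID; rep_len() recycles the list of colors
# if there are more IDs than colors
ids <- unique(tbl$ID)
id_colors <- setNames(rep_len(color_names, length(ids)), ids)

p <- ggplot(tbl, aes(x = ID_SUB_ID, y = Value)) +
  geom_violin(aes(fill = ID)) +
  scale_fill_manual(values = id_colors, guide = "none")
p
```

Because the colors are looked up by ID through the scale, no color column has to be added to every row of the data.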



That's it.
I will discuss how to beautify the plot in another post.