2017-01-19

Basic graphs

Scatter plots

  • A scatterplot consists of an X axis (horizontal), a Y axis (vertical), and a series of dots. Each dot represents one observation from a data set, and its position gives that observation's X and Y values.

plot function

  • ?plot

  • Syntax

plot(x, y, type = , xlab = , ylab = , xlim = , ylim = , col = , main = , sub = , ...)

Arguments

  • x, y
    • provide the x and y coordinates for the plot.
  • type
    • "p": points;
    • "l": lines;
    • "b": both points and lines;
    • "c": empty points joined by lines;
    • "o": overplotted points and lines;
    • "s" and "S": stair steps;
    • "h": histogram-like vertical lines;
    • "n": no plotting (only the axes are set up).
  • main
    • an overall title for the plot
  • sub
    • a sub title for the plot
  • xlab
    • a title for the x axis
  • ylab
    • a title for the y axis
  • xlim
    • the x-axis limits, given as c(min, max)
  • ylim
    • the y-axis limits, given as c(min, max)

Examples:

par(mfrow=c(2,2))  ## several graphs on one page: par(mfrow = c(nrows, ncols)) fills the grid row by row
plot(cars$speed,cars$dist)
plot(cars$speed,cars$dist,col="red")
plot(cars$speed,cars$dist, xlab = "speed", ylab="stopping distance", main="speed vs distance")
plot(cars$speed,cars$dist, xlab = "speed", ylab="stopping distance", main="speed vs distance",xlim = c(0,50),ylim=c(0,200),sub="Here is a subtitle")

  • The different plot types
x <- 0:12
y <- sin(pi/5 * x)
op <- par(mfrow = c(3,3), mar = .1+ c(2,2,3,1))
for (tp in c("p","l","b",  "c","o","h",  "s","S","n")) {
   plot(y ~ x, type = tp, main = paste0("plot(*, type = \"", tp, "\")"))
   if(tp == "S") {
      lines(x, y, type = "s", col = "red", lty = 2)
      mtext("lines(*, type = \"s\", ...)", col = "red", cex = 0.8)
   }
}
par(op)  ## restore the graphical parameters saved in op

  • pch, lty : the plotting symbol (pch) and the line type (lty); see ?points and ?par
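The notes name pch and lty without showing their values, so here is a minimal reference chart as a sketch. It assumes the standard base-graphics codes (pch 0 to 25 and lty 1 to 6); the helper names pch_id, gx, gy and lty_name are made up for this illustration.

```r
## Reference chart: plotting symbols on the left, line types on the right
op <- par(mfrow = c(1, 2), mar = c(2, 2, 3, 1))

## pch = 0:25 are the built-in symbols; lay them out on a 7-column grid
pch_id <- 0:25
gx <- pch_id %% 7           # column position
gy <- 3 - pch_id %/% 7      # row position, top row first
plot(gx, gy, pch = pch_id, cex = 1.5,
     xlim = c(-0.5, 6.5), ylim = c(-0.5, 3.5),
     axes = FALSE, xlab = "", ylab = "", main = "pch = 0:25")
text(gx, gy - 0.3, labels = pch_id, cex = 0.7)

## lty = 1:6 can also be given by name
lty_name <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")
plot(c(0, 1), c(1, 6), type = "n", axes = FALSE, xlab = "", ylab = "",
     main = "lty = 1:6")
for (i in 1:6) {
  lines(c(0.35, 1), c(i, i), lty = i)
  text(0, i, labels = paste(i, lty_name[i]), adj = 0, cex = 0.8)
}
par(op)  ## restore the previous layout
```

Either form works in a call, for example plot(x, y, type = "b", pch = 19, lty = "dashed").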

Bar graphs

  • A bar graph of a qualitative data sample consists of parallel vertical bars that show the frequency distribution graphically.

  • Syntax
barplot(height, width = , space = , names.arg = , beside = , main = , xlab = , ylab = , ...)

Arguments

  • height : either a vector or matrix of values describing the bars which make up the plot.
  • width : bar widths
  • names.arg : a vector of names to be plotted below each bar or group of bars.
  • beside : a logical value. If FALSE, the columns of height are drawn as stacked bars; if TRUE, they are drawn as bars side by side
  • main, xlab, ylab, xlim, sub, col…
  • ?barplot

  • Example

data = c("a","b","a","c","a","b","c","c","c","c")
counts = table(data)
counts
## data
## a b c 
## 3 2 5
barplot(counts) 

barplot(counts,width=c(3,2,1), names.arg=c("apple","orange","pear"),col=c("red","orange","yellow"))

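  • When the height argument is a matrix, each column is one bar (stacked) or one group of bars (side by side), and each row is one segment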
data = matrix(1:9,3,3)
colnames(data) = c("apple","orange","pear")
rownames(data) = c("big","medium","small")
data
##        apple orange pear
## big        1      4    7
## medium     2      5    8
## small      3      6    9
barplot(data, beside = FALSE, main = "Stacked", legend.text = TRUE, args.legend = list(x = "topleft"), col = c("red","orange","yellow"))

barplot(data, beside = TRUE, main = "Side by side", legend.text = TRUE, args.legend = list(x = "topleft"), col = c("red","orange","yellow"))
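To make the matrix case more concrete, the sketch below checks the stacked totals and uses the value returned by barplot() to label the side-by-side bars. The names fruit and mid are hypothetical; fruit simply rebuilds the 3 x 3 matrix used above.

```r
fruit <- matrix(1:9, 3, 3,
                dimnames = list(c("big", "medium", "small"),
                                c("apple", "orange", "pear")))

## Stacked: the total height of each bar is the column sum
colSums(fruit)

## Side by side: barplot() invisibly returns the bar midpoints,
## one per cell of the matrix, which can be used to place labels
mid <- barplot(fruit, beside = TRUE, ylim = c(0, 10),
               col = c("red", "orange", "yellow"))
text(as.vector(mid), as.vector(fruit) + 0.4, labels = as.vector(fruit))
```

Reusing the returned midpoints avoids guessing the x positions, which depend on width and space.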

Histogram

  • Like a bar chart, a histogram is made up of columns plotted on a graph, but each column counts the values of a numeric variable that fall in one interval (bin), so there is usually no space between adjacent columns.

  • Syntax
hist(x, breaks = , freq = , right = , main = , xlab = , ylab = , xlim = , ylim = , ...)

Arguments

  • x : a vector of values
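  • breaks : how to set the breakpoints: a single number (a suggested number of cells), a vector of breakpoints, or the name of an algorithm such as "Sturges" (the default)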
  • freq : logical; if TRUE, show frequencies; if FALSE, probability densities
  • right : logical; if TRUE, the histogram cells are right-closed (left open) intervals.
  • ?hist
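The effect of right is easiest to see when values fall exactly on a breakpoint. The sketch below uses plot = FALSE so that hist() only returns the counts; the data vals and breakpoints brk are made up for the illustration.

```r
vals <- c(1, 2, 2, 3)     # hypothetical sample; every value sits on a breakpoint
brk  <- c(0, 1, 2, 3)

## right = TRUE (the default): cells are [0,1], (1,2], (2,3]
h_right <- hist(vals, breaks = brk, right = TRUE, plot = FALSE)
h_right$counts   # 1 2 1

## right = FALSE: cells are [0,1), [1,2), [2,3]
h_left <- hist(vals, breaks = brk, right = FALSE, plot = FALSE)
h_left$counts    # 0 1 3
```

The outermost cell is closed on both sides in each case because include.lowest defaults to TRUE.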

  • Example

data = c(2,4,5,7,11,12,15,16,17,18,19,20)
hist(data)

  • break points
data = rnorm(1000,20,2)
hist(data,main="default")

hist(data,breaks = 20, main="20 breaks" )

hist(data, breaks = c(0,10,15,18:22,25,30), main = "Customize the breakpoints")

Boxplot

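  • The box plot is a standardized way of displaying the distribution of data based on the five number summary: minimum, first quartile, median, third quartile, and maximum. By default, R draws the whiskers only to the most extreme observations within 1.5 times the interquartile range of the box and plots anything beyond that as individual points.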

  • Syntax
boxplot(x, data = , range = , main = , xlab = , ylab = , ...)

Arguments

  • x : a vector or a formula, e.g. y ~ group
  • data : a data frame providing the data
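  • range : determines how far the plot whiskers extend out from the box, as a multiple of the interquartile range (default 1.5; 0 extends the whiskers to the data extremes)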
  • ?boxplot
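As a rough illustration of the range argument, the sketch below compares the statistics boxplot() computes with and without the default whisker rule. It uses plot = FALSE so nothing is drawn; the vector obs is invented for the example.

```r
obs <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)   # hypothetical data with one extreme value

## Default range = 1.5: the upper whisker stops at 9 and 100 is flagged as an outlier
b_default <- boxplot(obs, plot = FALSE)
b_default$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
b_default$out

## range = 0: the whiskers run to the minimum and maximum, so nothing is flagged
b_full <- boxplot(obs, range = 0, plot = FALSE)
b_full$stats
b_full$out
```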

  • Example

  • for individual variable
boxplot(mtcars$mpg, ylab="Miles Per Gallon")

  • for variables by group
boxplot(mpg~cyl,data=mtcars, main="Car Mileage Data", 
    xlab="Number of Cylinders", ylab="Miles Per Gallon")

Plot a function curve

  • Draws a curve corresponding to a function over the interval [from, to]

  • curve function
  • Syntax

curve(expr, from = , to = ,xlim = ,xlab = ,ylab = , main = ,...)

Arguments

  • expr : The name of a function, or a call or an expression written as a function of x
  • from, to : the range over which the function will be plotted
  • add : logical; if TRUE add to an already existing plot
  • Examples

curve(cos, from = -3*pi, to = 3*pi)  ## a function

curve(x^3,from = -5,to=5) ## an expression

  • normal distribution
curve(dnorm(x,mean=5,sd=2), from = -2, to=12, main="Use dnorm() function") ## a call written as a function of x

curve(1/(sqrt(2*pi)*2)*exp(-(x-5)^2/(2*4)),from = -2, to = 12, main ="Use an expression")
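The two curves above should coincide, because the expression is the normal density with mean 5 and standard deviation 2 written out by hand. A quick numerical check, with mu, sigma, x_grid and by_hand as illustrative names:

```r
## The normal density with mean mu and standard deviation sigma, written out by hand
mu <- 5
sigma <- 2
x_grid <- seq(-2, 12, by = 0.5)
by_hand <- 1 / (sqrt(2 * pi) * sigma) * exp(-(x_grid - mu)^2 / (2 * sigma^2))

## TRUE if the two agree up to floating-point tolerance
all.equal(by_hand, dnorm(x_grid, mean = mu, sd = sigma))
```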

Enhance the graphs

Add text annotations

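  • add text to a plot
  • Syntax
text(x, y, labels, adj, ...)
  • x,y : coordinates where the text labels should be written.
  • labels : a character vector or expression specifying the text to be written
  • adj : one or two values, usually in [0, 1], giving the horizontal (and optionally vertical) justification of the labels: 0 is left/bottom, 0.5 is centered, 1 is right/top
  • …
  • Example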
x = y = 1:10
plot(x,y)
text(x,y+0.4,labels= letters[1:10],col="red")
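To show what adj does, the sketch below writes three labels at the same x position with different horizontal justification. The dotted line and the solid dots mark the anchor coordinates; adj_vals is a name chosen for this illustration.

```r
plot(c(0, 2), c(0, 4), type = "n", xlab = "x", ylab = "y")
abline(v = 1, lty = 3)                 # reference line at x = 1
adj_vals <- c(0, 0.5, 1)
for (i in seq_along(adj_vals)) {
  points(1, i, pch = 19)               # the anchor point passed to text()
  text(1, i, labels = paste("adj =", adj_vals[i]),
       adj = adj_vals[i], col = "red")
}
```

With adj = 0 the label starts at the anchor, with 0.5 it is centered on it, and with 1 it ends there.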

Add lines and points

abline function

  • adds one or more straight lines through the current plot
  • Syntax
abline(a = , b = , h = , v = ,  ...)
  • a, b: the intercept and slope, single values.
  • h : the y-value(s) for horizontal line(s)
  • v : the x-value(s) for vertical line(s)

  • Example

## set up a coordinate system
plot(c(-2,3), c(-1,5), type = "n", xlab = "x", ylab = "y",asp=1)

abline(v = 0, col = "red") ## add a vertical line at x = 0
text(0,4,"abline(v = 0)",adj = 1)
abline(h = 0, col = "blue") ## add a horizontal line at y = 0
text(1,0, "abline( h = 0 )",adj = c(0, -.1))
abline(a = 1, b = 2, col = "green")
text(1,3, "abline( 1, 2 )", adj = c(-.1, -.1))

l <- lm(dist ~ speed, data = cars)
plot(cars) ## draw the scatterplot first
abline(l,col="red") ## add a regression line
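When abline() is given a fitted model, it takes the intercept and slope from the model's coefficients. The sketch below draws the same regression line both ways; fit and cf are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)
cf <- coef(fit)          # named vector: "(Intercept)" and "speed"

plot(cars)
abline(fit, col = "red")                              # pass the fitted model
abline(a = cf[["(Intercept)"]], b = cf[["speed"]],    # same line from a and b
       col = "blue", lty = 2)
```

The dashed blue line should lie exactly on top of the red one.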

curve function with add = TRUE

  • Same syntax as before, with the argument add = TRUE

  • Example

samples = rnorm(1000,mean = 5,sd = 2)
hist(samples, probability = TRUE)
curve(dnorm(x, mean=5,sd = 2),col="blue",add=TRUE)

points function

  • add points to a plot
  • Syntax
points(x,y, col = , pch = ,...)
  • x, y : coordinate vectors of points to plot.
  • pch : the plotting symbol, given as an integer code or a single character

  • Example

plot(cars,type="n")
carsLong = cars[cars$dist>=70,]
carsMid = cars[cars$dist>=30 & cars$dist<70,]
carsShort = cars[cars$dist<30,]
points(carsLong, col="red",pch=19)
points(carsMid, col="green",pch=19)
points(carsShort, col="blue",pch=19)

## add a legend
legend("topleft",legend = c("long","mid","short"),col = c("red","green","blue"),pch = 19,border=NA)

An example

x = -5:5
y = x^2 + rnorm(length(x))
plot(x,y,pch=6,col="red", main = "Main title", sub = "subtitle")
curve(x^2,from = -5, to = 5,lty = 3,col="blue",add=TRUE)


abline(h = 10,col="green")
abline(v = 0,col = "yellow")
abline(5, 2,col="purple",lty=2) 
text(-2, 20, "write text here") 

legend(3,3,"Legend here",pch = 6,col="red")

Interactive graphs

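  • identify : label the points nearest to mouse clicks on the current plot
identify(x,y,labels,...)
  • locator : read the coordinates of mouse clicks, optionally drawing points or lines there
locator(n, type, ...)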
  • Using packages (rCharts, iPlot,…)
library(rCharts)
n1 = nPlot(wt~mpg | cyl ,data = mtcars, color = "cyl",type="scatterChart")
n1$save('rchart.html', cdn = TRUE)
n1$show('inline', include_assets = TRUE, standalone = TRUE)
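For the base-graphics functions identify() and locator() listed at the start of this section, a minimal sketch of a session might look like the following. It only does something in an interactive R session with a screen device, and the names picked and ends are illustrative.

```r
if (interactive()) {
  plot(cars$speed, cars$dist, xlab = "speed", ylab = "stopping distance")

  ## Click near up to three points; each is labelled with its row name
  picked <- identify(cars$speed, cars$dist, labels = rownames(cars), n = 3)
  print(cars[picked, ])                 # the rows that were clicked

  ## Click two positions; type = "l" joins them with a line
  ends <- locator(n = 2, type = "l")
  print(ends)                           # a list with components x and y
}
```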