May 12, 2021 R language tutorial
A scatter plot shows many points plotted in the Cartesian plane. Each point represents the values of two variables.
One variable is placed on the horizontal axis and the other on the vertical axis.
Use the plot() function to create a simple scatter plot.
The basic syntax for creating a scatter plot in R is -
plot(x, y, main, xlab, ylab, xlim, ylim, axes)
The following is a description of the parameters used -
x is the data set whose values are the horizontal coordinates.
y is the data set whose values are the vertical coordinates.
main is the title of the chart.
xlab is the label on the horizontal axis.
ylab is the label on the vertical axis.
xlim is the range of x values used for plotting.
ylim is the range of y values used for plotting.
axes indicates whether both axes should be drawn on the plot.
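As a minimal sketch of how these arguments fit together, the call below passes each of them by name. The vectors x and y are hypothetical sample values invented for this illustration and are not part of the tutorial's data.

```r
# Hypothetical sample data, used only to illustrate the arguments.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Axis ranges chosen so that every sample point lies inside them.
x_limits <- c(0, 6)
y_limits <- c(0, 12)

# Draw the scatter plot on the current graphics device.
plot(x, y,
     main = "Example scatter plot",  # chart title
     xlab = "x values",              # horizontal axis label
     ylab = "y values",              # vertical axis label
     xlim = x_limits,                # range of the horizontal axis
     ylim = y_limits,                # range of the vertical axis
     axes = TRUE)                    # draw both axes (the default)
```

Setting axes = FALSE instead would draw the points without the axes.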
We use the data set "mtcars", which is available in the R environment, to create a basic scatter plot. Let's use the "wt" and "mpg" columns of mtcars.
input <- mtcars[,c('wt','mpg')]
print(head(input))
When we execute the code above, it produces the following results -
                     wt  mpg
Mazda RX4         2.620 21.0
Mazda RX4 Wag     2.875 21.0
Datsun 710        2.320 22.8
Hornet 4 Drive    3.215 21.4
Hornet Sportabout 3.440 18.7
Valiant           3.460 18.1
The following script creates a scatter plot of the relationship between wt (weight) and mpg (miles per gallon).
# Get the input values.
input <- mtcars[,c('wt','mpg')]

# Give the chart file a name.
png(file = "scatterplot.png")

# Plot the chart for cars with weight between 2.5 and 5 and mileage between 15 and 30.
plot(x = input$wt, y = input$mpg,
   xlab = "Weight",
   ylab = "Mileage",
   xlim = c(2.5,5),
   ylim = c(15,30),
   main = "Weight vs Mileage"
)

# Save the file.
dev.off()
When we execute the code above, the chart is saved to the file scatterplot.png in the current working directory.
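Because xlim and ylim restrict the plotted range, some cars in mtcars do not appear in the chart. The sketch below is one way to check which rows fall inside the requested window; the names x_limits, y_limits and inside are introduced here for illustration only. Note that R pads the axis range slightly by default, so a point lying just outside a limit may still be drawn.

```r
# Same two columns as in the script above.
input <- mtcars[, c("wt", "mpg")]

# Limits passed to plot() in the script above.
x_limits <- c(2.5, 5)
y_limits <- c(15, 30)

# TRUE for rows whose wt and mpg both lie inside the limits.
inside <- input$wt >= x_limits[1] & input$wt <= x_limits[2] &
          input$mpg >= y_limits[1] & input$mpg <= y_limits[2]

# Number of cars inside the window, and the total number of cars.
print(sum(inside))
print(nrow(input))

# Names of the cars that lie outside the requested window.
print(rownames(input)[!inside])
```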
When we have more than two variables and want to see how each variable correlates with the others, we use a scatter plot matrix. We use the pairs() function to create a matrix of scatter plots.
The basic syntax for creating a scatter plot matrix in R is -
pairs(formula, data)
The following is a description of the parameters used -
formula represents the series of variables that are paired with each other.
data represents the data set from which the variables are taken.
Each variable is paired with each of the remaining variables. A scatter plot is drawn for each pair.
# Give the chart file a name.
png(file = "scatterplot_matrices.png")

# Plot the matrices between 4 variables giving 12 plots.
# One variable with 3 others and total 4 variables.
pairs(~wt+mpg+disp+cyl, data = mtcars,
   main = "Scatterplot Matrix")

# Save the file.
dev.off()
When we execute the code above, the scatter plot matrix is saved to the file scatterplot_matrices.png in the current working directory.
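The scatter plot matrix only shows the relationships visually. As a possible numeric companion, the sketch below uses the base R function cor(), which the tutorial itself does not use, to compute the pairwise correlations for the same four columns. It also reproduces the panel count: 4 variables, each paired with the 3 others, gives 12 plots. The names vars, n_panels and cor_matrix are illustrative.

```r
# The four mtcars columns used in the pairs() call above.
vars <- c("wt", "mpg", "disp", "cyl")

# Each variable is paired with every other one: 4 * 3 = 12 panels.
n_panels <- length(vars) * (length(vars) - 1)
print(n_panels)

# Pairwise Pearson correlations between the four columns.
cor_matrix <- cor(mtcars[, vars])
print(round(cor_matrix, 2))
```

A value near 1 or -1 in the matrix corresponds to a panel whose points lie close to a straight line, while a value near 0 corresponds to a panel with no clear linear trend.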