TEDSF Interview Skills Q&A Platform
How will you create scatterplot matrices in R language?
in Data Science by Platinum (104k points)

1 Answer

Best answer
  1. Launch RStudio and set up your working directory (see: Running RStudio and setting up your working directory).

  2. Prepare your data (see: Best practices for preparing your data) and save it in an external tab-delimited .txt file or a .csv file.

  3. Import your data into R.
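Step 3 could look roughly like the sketch below. The file is a temporary CSV written only so the example runs on its own (a stand-in for your own file, and an assumption of this sketch); with real data you would pass your own path to read.csv(), or use read.delim() for a tab-delimited .txt file.

```r
# Write a small CSV to a temporary file so the example is self-contained.
# In practice, replace csv_path with the path to your own file.
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Import the CSV; stringsAsFactors = TRUE turns text columns into factors
my_data <- read.csv(csv_path, stringsAsFactors = TRUE)

# For a tab-delimited .txt file, read.delim() takes the same arguments, e.g.
# my_data <- read.delim("my_file.txt", stringsAsFactors = TRUE)  # hypothetical file name

str(my_data)
```

Having the grouping column as a factor matters later, when it is used to pick one color per group.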

Data

The iris data set is used in the following examples. It gives the measurements, in centimeters, of sepal length and width and petal length and width for 50 flowers from each of 3 species of iris: Iris setosa, versicolor, and virginica.

head(iris)
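A quick way to confirm the description above before plotting is to check the dimensions, the group sizes, and the column types. This is only a sanity check, not part of the plotting itself.

```r
# 150 rows (flowers) and 5 columns (4 measurements + Species)
dim(iris)

# Number of flowers per species
table(iris$Species)

# Columns 1 to 4 are numeric, Species is a factor
sapply(iris, class)
```

The first four columns are the numeric ones, which is why the examples pass iris[,1:4] to pairs().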

 

R base scatter plot matrices: pairs()

  • Basic plots:
pairs(iris[,1:4], pch = 19)

 

  • Show only upper panel:
pairs(iris[,1:4], pch = 19, lower.panel = NULL)

 

  • Color points by groups (species):
my_cols <- c("#00AFBB", "#E7B800", "#FC4E07")  
pairs(iris[,1:4], pch = 19,  cex = 0.5,
      col = my_cols[iris$Species],
      lower.panel=NULL)
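The expression my_cols[iris$Species] works because a factor is stored as integer codes, one per level, so indexing with it picks one color per observation. A minimal sketch of that mapping (the names species_codes, point_cols and colour_key are introduced here only for illustration):

```r
my_cols <- c("#00AFBB", "#E7B800", "#FC4E07")

# Integer code behind each factor value: 1, 2 or 3
species_codes <- as.integer(iris$Species)

# Indexing with the factor is equivalent to indexing with its codes
point_cols <- my_cols[iris$Species]

# Pair each species level with the color it receives
colour_key <- setNames(my_cols, levels(iris$Species))
colour_key
```

The colors follow the order of levels(iris$Species), so reordering the factor levels would change which species gets which color.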

 

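  • Add correlations on the lower panels: the size of the text is proportional to the absolute value of the correlation.
# Correlation panel
panel.cor <- function(x, y){
    # Switch to a 0-1 coordinate system; restore the previous one on exit
    usr <- par("usr"); on.exit(par(usr = usr))
    par(usr = c(0, 1, 0, 1))
    r <- round(cor(x, y), digits=2)
    txt <- paste0("R = ", r)
    cex.cor <- 0.8/strwidth(txt)
    # Scale by abs(r): cex must be positive, and some iris correlations are negative
    text(0.5, 0.5, txt, cex = cex.cor * abs(r))
}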
# Customize upper panel
upper.panel<-function(x, y){
  points(x,y, pch = 19, col = my_cols[iris$Species])
}
# Create the plots
pairs(iris[,1:4], 
      lower.panel = panel.cor,
      upper.panel = upper.panel)
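The values drawn in the lower panels can be cross-checked numerically with cor() on the same four columns. A short sketch (cor_mat is just an illustrative name):

```r
# Pearson correlation matrix of the four measurements, rounded as in the panels
cor_mat <- round(cor(iris[, 1:4]), digits = 2)
cor_mat

# Pairs with a negative correlation (all involve Sepal.Width in this data set)
which(cor_mat < 0, arr.ind = TRUE)
```

Each off-diagonal entry corresponds to one panel of the scatter plot matrix, and the matrix is symmetric, so the upper and lower triangles carry the same information.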

 

  • Add correlations on the scatter plots:
# Customize upper panel
upper.panel<-function(x, y){
  points(x,y, pch=19, col=c("red", "green3", "blue")[iris$Species])
  r <- round(cor(x, y), digits=2)
  txt <- paste0("R = ", r)
  # Save the current coordinate system and restore it when the panel is done
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.9, txt)
}
pairs(iris[,1:4], lower.panel = NULL, 
      upper.panel = upper.panel)

 

Use the R package psych

The function pairs.panels() [in the psych package] can also be used to create a scatter plot matrix, with bivariate scatter plots below the diagonal, histograms on the diagonal, and the Pearson correlations above the diagonal.

library(psych)
pairs.panels(iris[,-5], 
             method = "pearson", # correlation method
             hist.col = "#00AFBB",
             density = TRUE,  # show density plots
             ellipses = TRUE # show correlation ellipses
             )
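If psych is not installed, the histograms on the diagonal can be approximated in base R through the diag.panel argument of pairs(). The sketch below is adapted from the panel.hist example on the pairs() help page; panel.hist is a user-defined helper, not a built-in function, and the result is only a rough stand-in for the pairs.panels() layout.

```r
# User-defined diagonal panel: draws a histogram of one variable
panel.hist <- function(x, ...) {
  usr <- par("usr"); on.exit(par(usr = usr))
  # Keep the x-range; rescale y so the bars leave room for the variable name
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)            # compute the bins without plotting
  n_breaks <- length(h$breaks)
  heights <- h$counts / max(h$counts)   # scale the tallest bar to 1
  rect(h$breaks[-n_breaks], 0, h$breaks[-1], heights, col = "#00AFBB")
}

pairs(iris[, 1:4], pch = 19, cex = 0.5,
      diag.panel = panel.hist)
```

This can be combined with a custom lower.panel or upper.panel function in the same pairs() call.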

 

 
