Questions tagged [scatterplot]

Pairs of (x,y) values plotted as points in Cartesian coordinates. Widely used as an exploratory and diagnostic tool.

A scatterplot summarizes bivariate data by plotting pairs of observations on an x-y plane (i.e. in Cartesian coordinates). It is very widely used, particularly in data visualization, exploratory data analysis and diagnostics.

274 questions
56
votes
7 answers

Graph for relationship between two ordinal variables

What is an appropriate graph to illustrate the relationship between two ordinal variables? A few options I can think of: Scatter plot with added random jitter to stop points hiding each other. Apparently a standard graphic - Minitab calls this an…
Silverfish
  • 20,678
  • 23
  • 92
  • 180
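
One alternative to jittering, when both variables take only a few ordered levels, is to count how many observations share each (x, y) pair and use that count to size or shade the plotted point. The sketch below is only an illustration: the function name `cell_counts` and the ratings are made up and do not come from the question.

```python
from collections import Counter


def cell_counts(xs, ys):
    """Count how many observations fall on each (x, y) pair."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return Counter(zip(xs, ys))


# Hypothetical ordinal ratings on a 1-3 scale (made-up data for illustration).
x = [1, 1, 2, 2, 2, 3, 3, 1]
y = [1, 1, 2, 3, 2, 3, 3, 2]

counts = cell_counts(x, y)
for (xv, yv), n in sorted(counts.items()):
    # Each line gives one plotting position and the number of points hidden there.
    print(f"x={xv} y={yv} count={n}")
```

The counts could then drive marker area in whatever plotting tool is in use; how to scale them is a separate design choice.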
48
votes
6 answers

How do I avoid overlapping labels in an R plot?

I'm trying to label a pretty simple scatterplot in R. This is what I use: plot(SI, TI); text(SI, TI, Name, pos=4, cex=0.7). The result is mediocre, as you can see: I tried to compensate for this using the textxy function, but it's…
slhck
  • 787
  • 2
  • 8
  • 20
39
votes
9 answers

What is the relationship between $Y$ and $X$ in this plot?

What is the relationship between $Y$ and $X$ in the following plot? In my view there is a negative linear relationship, but because we have a lot of outliers, the relationship is very weak. Am I right? I want to learn how we can explain…
PSS
  • 773
  • 3
  • 9
  • 14
36
votes
3 answers

How to draw neat polygons around scatterplot regions in ggplot2

How do I add a neat polygon around a group of points on a scatterplot? I am using ggplot2 but am disappointed with the results of geom_polygon. The dataset is over there, as a tab-delimited text file. The graph below shows two measures of attitudes…
Fr.
  • 1,343
  • 3
  • 11
  • 22
30
votes
3 answers

What's a good way to use R to make a scatterplot that separates the data by treatment?

I'm very new with R and stats in general, but I need to make a scatterplot that I think might be beyond its native capacities. I have a couple of vectors of observations and I want to make a scatterplot with them, and each pair falls into one out of…
crf
  • 479
  • 1
  • 7
  • 14
25
votes
2 answers

What does an Added Variable Plot (Partial Regression Plot) explain in a multiple regression?

I have a model of the Movies dataset and I used the regression: model <- lm(imdbVotes ~ imdbRating + tomatoRating + tomatoUserReviews + I(genre1 ** 3.0) + I(genre2 ** 2.0) + I(genre3 ** 1.0), data = movies); library(ggplot2); res <- qplot(fitted(model),…
23
votes
2 answers

Scatterplot with contour/heat overlay

I saw this plot in the supplement of a recent paper and I'd love to be able to reproduce it using R. It's a scatterplot, but to fix the overplotting there are contour lines that are "heat" colored blue to red corresponding to the overplotting…
Stephen Turner
  • 4,183
  • 8
  • 27
  • 33
22
votes
2 answers

Good online resource with tips on graphing association between two numeric variables under various conditions

Context: Over time I've acquired a set of heuristics on how to effectively plot the association between two numeric variables. I imagine most people who work with data would have a similar set of rules. Examples of such rules might be: If one…
Jeromy Anglim
  • 42,044
  • 23
  • 146
  • 250
15
votes
1 answer

Assumptions of generalised linear model

I have made a generalised linear model with a single response variable (continuous/normally distributed) and 4 explanatory variables (3 of which are factors and the fourth is an integer). I have used a Gaussian error distribution with an identity…
luciano
  • 12,197
  • 30
  • 87
  • 119
14
votes
1 answer

Why jitter continuous value in a scatterplot?

I'm using Orange Canvas and I generated a scatter plot. I have the possibility to jitter continuous variables but I really don't know why would I do that. What's the idea behind jittering?
Pierre
  • 412
  • 5
  • 16
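
For readers unsure what jittering does mechanically: it adds a small amount of random noise to each plotted coordinate so that observations sharing the same recorded value (for example, rounded measurements) no longer sit exactly on top of each other. The sketch below is a minimal illustration under assumed inputs. The function `jitter`, its `amount` parameter and the data are hypothetical, and this is not how Orange Canvas implements the option.

```python
import random


def jitter(values, amount, seed=None):
    """Return a copy of values with uniform noise in [-amount, amount] added."""
    rng = random.Random(seed)  # a local generator keeps the example reproducible
    return [v + rng.uniform(-amount, amount) for v in values]


# Made-up measurements recorded to the nearest integer, so several coincide.
x = [1, 1, 1, 2, 2, 3]
x_jittered = jitter(x, amount=0.2, seed=0)

for original, moved in zip(x, x_jittered):
    print(f"{original} -> {moved:.3f}")
```

The noise is for display only. Any statistics should still be computed from the original values, and `amount` should stay small relative to the spacing between recorded values.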
13
votes
2 answers

Is there any statistical reason for diagonal lines in scatterplot on a log scale?

I'm perplexed by some vertical lines that show up in these scatter plots on a log scale. Population is on the y-axis and the proportion of the neighborhood with the attribute mentioned in the panel label on the x-axis. Is this just an artifact of…
Tom
  • 1,511
  • 1
  • 12
  • 17
13
votes
2 answers

How to plot binary (presence/absence - 1/0) data against continuous variables

I am trying to plot presence/absence (1/0) of a sample species against various environmental variables. I have put presence/absence on the y-axis and the environmental variable (in this case barometric pressure) on the x axis, however the resulting…
Disco
  • 151
  • 1
  • 1
  • 6
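
One common way to make 0/1 data readable against a continuous variable is to bin the continuous variable and plot the proportion of presences in each bin. The sketch below assumes equal-width bins. The function `binned_proportions` and the pressure readings are invented for illustration and are not taken from the question.

```python
def binned_proportions(x, present, n_bins):
    """Return (bin_centre, proportion_present) for equal-width bins over x.

    Bins that contain no observations get a proportion of None.
    """
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("x must contain at least two distinct values")
    width = (hi - lo) / n_bins
    totals = [0] * n_bins
    hits = [0] * n_bins
    for xv, p in zip(x, present):
        # The maximum value would land one past the last bin, so clamp it.
        idx = min(int((xv - lo) / width), n_bins - 1)
        totals[idx] += 1
        hits[idx] += p
    return [
        (lo + (i + 0.5) * width, hits[i] / totals[i] if totals[i] else None)
        for i in range(n_bins)
    ]


# Hypothetical barometric pressure readings and presence (1) / absence (0).
pressure = [1000, 1002, 1004, 1006, 1008, 1010, 1012, 1014]
presence = [0, 0, 0, 1, 0, 1, 1, 1]

result = binned_proportions(pressure, presence, n_bins=2)
print(result)
```

Plotting the bin centres against the proportions gives a rough view of how the chance of presence changes with the variable. The number of bins is a judgment call.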
12
votes
5 answers

How do I interpret this Scatter Plot?

I have a scatter plot with sample size (the number of people) on the x axis and median salary on the y axis. I am trying to find out if the sample size has any effect on the median salary. This is the plot: How do I interpret this plot…
Sameed
  • 415
  • 1
  • 4
  • 10
11
votes
1 answer

Getting different results when plotting 95% CI ellipses with ggplot or the ellipse package

I want to visualize the results of a clustering (produced with protoclust{protoclust}) by creating scatter plots for each pair of variables used for classifying my data, colouring by classes and overlapping the ellipses for the 95% confidence…
josetanago
  • 113
  • 1
  • 1
  • 5
11
votes
3 answers

How to extract information from a scatterplot matrix when you have large N, discrete data, & many variables?

I'm playing around with the breast cancer dataset and created a scatterplot of all attributes to get an idea of which ones have the most effect on predicting the class malignant (blue) or benign (red). I understand that the row represents x axis…
birdy
  • 481
  • 8
  • 14