I pasted the data here. The data frame contains multiple observations of x and y per country, and each country belongs to a region.
Based on this post, I managed to draw polygons around the clusters in the scatterplot using ggplot, grouped by the same factor that determines the colors of my points (i.e., country). Here's the code I used:
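library(plyr)     # provides ddply
library(ggplot2)
# Return the rows of df that lie on its convex hull
find_hull <- function(df) df[chull(df$x, df$y), ]
# Compute one hull per country
hulls <- ddply(df, "country_name", find_hull)
plot <- ggplot(data = df, aes(x = x, y = y, colour = country_name, fill = country_name)) +
  geom_point() +
  geom_polygon(data = hulls, alpha = 0.5)
plot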
But what if I want to draw the polygons based on region, and still have the color assigned by country? Just changing country_name to region when ddplying the find_hull function did not produce satisfying results.
I have the feeling it's because I do not fully understand yet what the chull function does, but I didn't manage to wrap my head around it.
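To make the problem concrete, here is a minimal sketch on made-up toy data. The names toy and hulls_region are hypothetical and not part of the real data. chull returns the row indices of the points on the convex hull, so find_hull keeps only those rows. One possible explanation for the odd polygons, offered as an assumption rather than a confirmed diagnosis, is that geom_polygon inherits colour and fill by country from the top-level aes and so splits each region hull into per-country pieces. The sketch avoids this by mapping colour only in geom_point and grouping the polygons by region.

```r
# Toy data, invented for illustration: two regions with two countries each.
toy <- data.frame(
  x = c(0, 4, 4, 0, 2, 10, 14, 14, 10, 12),
  y = c(0, 0, 4, 4, 2, 0, 0, 4, 4, 2),
  country_name = rep(c("A", "B", "C", "D"), times = c(3, 2, 3, 2)),
  region = rep(c("North", "South"), each = 5)
)

# chull() returns the indices of the points on the convex hull;
# the interior point (2, 2) of region "North" is not among them.
north_idx <- chull(toy$x[1:5], toy$y[1:5])

# Same helper as in the question: keep only the hull rows of a data frame.
find_hull <- function(df) df[chull(df$x, df$y), ]

# One hull per region, computed with base R instead of plyr::ddply.
hulls_region <- do.call(rbind, lapply(split(toy, toy$region), find_hull))

# Plot only if ggplot2 is installed. Colour is mapped in geom_point() alone,
# so the polygons are grouped by region and not split by country.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = x, y = y)) +
    geom_polygon(data = hulls_region, aes(group = region),
                 fill = "grey70", alpha = 0.5) +
    geom_point(aes(colour = country_name))
  print(p)
}
```

With the real data, the same idea would be to compute hulls with ddply(df, "region", find_hull) and pass aes(group = region) to geom_polygon, keeping colour = country_name out of the polygon layer.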