R How to find the secondary peak of a distribution

Question

For example, if I generate density with R code

D = c(rnorm(100,1,1), rnorm(100,5,1))

Then the following density will follow:

We can find the primary peak location by

density(D)$x[which(density(D)$y == max(density(D)$y))]

But how to find the secondary peak?

You can find the n'th largest number in R by: sort(x, TRUE)[n] - where x is vector of numbers. — Repmat, Aug 10 '17 at 06:38
@Repmat but the second largest number in density values might not be the value of the second peak. It could be a point next to the first peak which is still larger than the second peak. — John, Aug 10 '17 at 06:43
You're looking for local maxima. The answer to [this question](https://stats.stackexchange.com/questions/30750/finding-local-extrema-of-a-density-function-using-splines) could help. — Alex Firsov, Aug 10 '17 at 06:43
Alternatively, you could write a function with some optimization technique (think Newton's method etc) to identify the maxima in `density(D)$y` and then store them away. Then sort the vector, and voila! — Alex Firsov, Aug 10 '17 at 06:46

rightskewed · Accepted Answer · 2017-08-10T08:04:07.043

You can use mixture models to capture the biomodality

library(flexmix)
set.seed(42)

D <- c(rnorm(100,1,1), rnorm(100,5,1))
kde <- density(D)
m1 <- FLXMRglm(family = "gaussian")
m2 <- FLXMRglm(family = "gaussian")
fit <- flexmix(D ~ 1, data = as.data.frame(D), k = 2, model = list(m1, m2))
c1 <- parameters(fit, component=1)[[1]]
c2 <- parameters(fit, component=2)[[1]]




> c1
                  Comp.1
coef.(Intercept) 1.022880
sigma            1.031319


> c2
                  Comp.2
coef.(Intercept) 4.9042434
sigma            0.9081448

plot(kde)
abline(v=1, col='blue')
abline(v=c1[[1]], lty=2, col='blue')
abline(v=5, col='red')
abline(v=c2[[1]], lty=2, col='red')

Roland · Answer 2 · 2017-08-10T07:45:39.427

If you can assume that you have a mixture of normal distributions, simply use a mixture model:

set.seed(42)
D = c(rnorm(100,1,1), rnorm(100,5,1))

library(mixtools)
mD <- normalmixEM(D)
mD$mu
#[1] 1.079553 4.918794
summary(mD)

plot(mD, which=2)
lines(density(D, "SJ"), lwd = 2)

If you really need the exact peak locations of the combined density function, you have all necessary values available (mixing proportion, means and standard deviations) for calculating the maxima. I don't have time to figure out the maths right now, but it shouldn't be too hard.

R How to find the secondary peak of a distribution

2 Answers2