A study gives the following:
$n = 67$
mean = 73
sd = 68
median = 55
IQR = 66
Is it possible to recover the actual Q1 and Q3 values from this information? I used $n$, the mean, and the sd to get a 95% CI. Should that be roughly similar?
As @Dave noted in a comment, you would have to make some assumptions about the distribution. Given the mean and the median being so different, it's likely that there is substantial skew - and you confirm this in a comment.
Various assumptions might be reasonable.
With median = 55 and IQR = 66 and no other information, a symmetric distribution would put the quartiles at 22 and 88. Without the symmetry assumption, you could have anything from -10 and 56 to 54 and 120. But you have additional info: the mean and sd, which will limit the possibilities. You can probably also figure out some things from the nature of the variable (e.g., is it always positive?) and try various distributions.
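As a quick arithmetic check of the ranges above, here is a minimal sketch using only the two constraints Q1 ≤ median ≤ Q3 and Q3 − Q1 = IQR. The variable names are my own. The closed-interval endpoints come out as -11/55 and 55/121; the figures quoted above are the values just inside those limits.

```python
median, iqr = 55, 66

# Symmetric case: the quartiles sit half an IQR on either side of the median.
q1_sym, q3_sym = median - iqr / 2, median + iqr / 2

# General case: Q1 <= median <= Q3 with Q3 - Q1 = IQR,
# so Q1 can slide anywhere in [median - IQR, median].
q1_low, q1_high = median - iqr, median
extreme_low = (q1_low, q1_low + iqr)     # Q3 pinned at the median
extreme_high = (q1_high, q1_high + iqr)  # Q1 pinned at the median

print("symmetric quartiles:", q1_sym, q3_sym)
print("extreme pairs:", extreme_low, extreme_high)
```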
You should have given some context: what (real-life) variable $x$ do your data represent? Some questions you probably know the answers to:
What is the possible range for $x$? That is, is $x$ nonnegative? or a count? ...
Can we suppose independence?
Nevertheless, some observations:
The mean is larger than the median, and a 95% confidence interval for the mean based on normal distributions gives about $(56.4, 89.6)$; the observed median is just outside. So the data cast doubt on symmetry and point to a right-skewed distribution.
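The interval can be reproduced with a few lines. This is a sketch, not necessarily how the numbers above were obtained: the z quantile comes from the standard library, while the t quantile for 66 degrees of freedom (`t66`) is a hard-coded table value that I am assuming, since the standard library has no t distribution. The t version appears to match the quoted endpoints; the z version is slightly narrower.

```python
from math import sqrt
from statistics import NormalDist

n, mean, sd, median = 67, 73, 68, 55
se = sd / sqrt(n)  # standard error of the mean

z = NormalDist().inv_cdf(0.975)  # standard normal 97.5% quantile, about 1.96
t66 = 1.9966                     # ASSUMED table value: 97.5% t quantile, 66 df

ci_z = (mean - z * se, mean + z * se)
ci_t = (mean - t66 * se, mean + t66 * se)

print("z-based CI:", tuple(round(v, 1) for v in ci_z))
print("t-based CI:", tuple(round(v, 1) for v in ci_t))
print("median below both intervals:", median < ci_t[0] and median < ci_z[0])
```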
The observed mean and standard deviation are close, pointing to an exponential (or more generally gamma) distribution.
One can also get a rather close fit with a lognormal distribution; I find that $\mu=4, \sigma=0.778$ is close. One could also try normal or skew-normal distributions. As soon as you decide to try some distributional family as a model, you can use the given descriptive statistics to find moment-type estimators, and given those estimators, it is easy to calculate the quartiles.
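To illustrate the moment-type approach, here is a minimal sketch that matches a lognormal to the reported mean and sd and then reads off the quartiles. The matching formulas are the standard lognormal moment relations; the exponential case is included only as the simplest gamma (shape 1), which is my own simplification and not a fitted gamma.

```python
from math import exp, log, sqrt
from statistics import NormalDist

mean, sd = 73, 68

# Lognormal method of moments:
#   sigma^2 = log(1 + (sd/mean)^2),  mu = log(mean) - sigma^2 / 2
sigma2 = log(1 + (sd / mean) ** 2)
sigma = sqrt(sigma2)
mu = log(mean) - sigma2 / 2

# Lognormal quantiles are exp(mu + z * sigma) for standard normal quantile z.
z75 = NormalDist().inv_cdf(0.75)
q1, q2, q3 = (exp(mu + z * sigma) for z in (-z75, 0.0, z75))

# Exponential with the same mean (gamma with shape 1, an ASSUMED simplification):
# quantile function is -mean * log(1 - p).
eq1, eq2, eq3 = (-mean * log(1 - p) for p in (0.25, 0.5, 0.75))

print(f"lognormal: mu={mu:.3f}, sigma={sigma:.3f}")
print(f"lognormal Q1, median, Q3: {q1:.1f}, {q2:.1f}, {q3:.1f}")
print(f"exponential Q1, median, Q3: {eq1:.1f}, {eq2:.1f}, {eq3:.1f}")
```

The fitted values land near $\mu=4, \sigma=0.778$, and the implied lognormal quartiles are roughly 31 and 91. Neither model reproduces the reported median and IQR exactly, which is expected when only two of the four summaries are used for fitting.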
Can we say something more? Maybe by comparing some such models? I doubt that normal or skew-normal models can give a good fit, so let us try the gamma and lognormal models. We can simulate data from such models and use ABC methods (approximate Bayesian computation) to compare them. Some details here: How to do estimation, when only summary statistics are available?
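The following is a stripped-down, rejection-style check in the spirit of ABC, not a full ABC analysis: parameters are fixed at moment-matched values instead of being drawn from a prior, and the distance function, tolerance, seed, and number of replicates are all arbitrary choices of mine. It simulates datasets of size 67 from each model and counts how often all four simulated summaries land close to the reported ones.

```python
import random
from math import log, sqrt
from statistics import mean, median, quantiles, stdev

OBS = {"mean": 73, "sd": 68, "median": 55, "iqr": 66}
N = 67

def summaries(xs):
    """Same four summaries as reported in the question."""
    q1, _, q3 = quantiles(xs, n=4)
    return {"mean": mean(xs), "sd": stdev(xs),
            "median": median(xs), "iqr": q3 - q1}

def distance(s):
    """Relative Euclidean distance between simulated and observed summaries."""
    return sqrt(sum(((s[k] - OBS[k]) / OBS[k]) ** 2 for k in OBS))

# Moment-matched parameters (fixed, not sampled from a prior).
shape = (OBS["mean"] / OBS["sd"]) ** 2
scale = OBS["sd"] ** 2 / OBS["mean"]
sigma2 = log(1 + (OBS["sd"] / OBS["mean"]) ** 2)
mu = log(OBS["mean"]) - sigma2 / 2

def draw_gamma(rng):
    return [rng.gammavariate(shape, scale) for _ in range(N)]

def draw_lognormal(rng):
    return [rng.lognormvariate(mu, sqrt(sigma2)) for _ in range(N)]

def acceptance_rate(draw, tol=0.25, reps=1000, seed=1):
    """Fraction of simulated datasets whose summaries fall within tol."""
    rng = random.Random(seed)
    hits = sum(distance(summaries(draw(rng))) < tol for _ in range(reps))
    return hits / reps

rates = {"gamma": acceptance_rate(draw_gamma),
         "lognormal": acceptance_rate(draw_lognormal)}
print(rates)
```

A higher acceptance rate suggests that the model reproduces the reported summaries more often, but with fixed parameters and an arbitrary tolerance this is only a rough indication.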
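One distribution that fits these parameters is a mixture of a point mass at $31.38$ (weight $25\%$), a lognormal distribution with median $55$ and mean $81.58$ (weight $50\%$), and a point mass at $97.38$ (weight $25\%$).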
So the quartiles could be at those point masses, though there are many other possibilities also.
I found this by solving some equations; here is the explanation:
The median of the lognormal is $55$, so the median of the mixture is also $55$.
The mean of the lognormal is $81.58$, so the mean of the mixture is $$(25\%)31.38+(50\%)81.58+(25\%)97.38\approx 73.$$
The point masses are at roughly the $26^{th}$ and $74^{th}$ percentiles of the lognormal, so they are at the $13^{th}-38^{th}$ and $62^{nd}-87^{th}$ percentiles of the mixture. In particular, they are the quartiles of the mixture and the IQR is $66$.
The second moment of the lognormal is $14643$, so the variance of the mixture is $$(25\%)(31.38-73)^2+(50\%)(14643 - 2(73)(81.58)+73^2)+(25\%)(97.38-73)^2\approx 68^2.$$
The final mixture is reasonably easy to understand, and you could tinker with it to get smaller point masses, an $n=67$ dataset with the same properties, or other possibilities.
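The arithmetic above can be checked numerically. In this sketch the weights and point-mass locations are read off the displayed equations, and the lognormal parameters are recovered from its stated median and mean via $\mu=\ln(\text{median})$ and $\sigma^2=2\ln(\text{mean}/\text{median})$; that recovery step is my assumption about how the component was specified. Because the inputs are rounded, the results match the targets only approximately.

```python
from math import exp, log, sqrt
from statistics import NormalDist

w_low, w_ln, w_high = 0.25, 0.50, 0.25  # mixture weights from the equations
a, b = 31.38, 97.38                     # point-mass locations
ln_median, ln_mean = 55, 81.58          # stated lognormal median and mean

# ASSUMED recovery of lognormal parameters from median and mean.
mu = log(ln_median)
sigma = sqrt(2 * log(ln_mean / ln_median))
ln_second_moment = exp(2 * mu + 2 * sigma ** 2)

mix_mean = w_low * a + w_ln * ln_mean + w_high * b
mix_second = w_low * a ** 2 + w_ln * ln_second_moment + w_high * b ** 2
mix_sd = sqrt(mix_second - mix_mean ** 2)

def mixture_cdf(x):
    """P(X <= x) for the three-component mixture (x > 0)."""
    ln_cdf = NormalDist(mu, sigma).cdf(log(x))
    return w_low * (x >= a) + w_ln * ln_cdf + w_high * (x >= b)

eps = 1e-9
print(f"mean={mix_mean:.2f}, sd={mix_sd:.2f}, IQR={b - a:.2f}")
print(f"CDF jumps at {a}: {mixture_cdf(a - eps):.2f} -> {mixture_cdf(a):.2f}")
print(f"CDF at 55: {mixture_cdf(55):.2f}")
print(f"CDF jumps at {b}: {mixture_cdf(b - eps):.2f} -> {mixture_cdf(b):.2f}")
```

The CDF jumps across 0.25 at the lower point mass and across 0.75 at the upper one, which is why those two points are the quartiles of the mixture.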