
In my thesis, I am trying to perform a Monte Carlo simulation with a set of parameters, where I draw a random value from a known distribution for each parameter to calculate a single run of the simulation.

However, for several parameters of the analysis, I was not given the sample data, but only this information:

        vars  n       mean   sd     median  trimmed  mad    min   max      skew  kurtosis  se
DATA    3     176999  49,04  71,67  26,12   36,03    11,28  0,02  2411,33  8,21  98,56     0,17

From what I was told, this is a set of descriptive statistics of the sample, obtained with the R package psych.

The problem is that I don't know anything else about this data. From what I researched, there are methods, such as the Pearson and Johnson systems (implemented in MATLAB, which I am using), for generating a random sample from summary statistics like these.

My idea was to use that random sample to estimate a PDF, and then draw random values from it for the MC simulation.

I ran the MATLAB function pearsrnd with these statistics. For some parameters I got results associated with the Gamma distribution, but for others I obtained only NaNs.
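One thing that may be worth checking before blaming the method: no distribution can have a kurtosis below skew^2 + 1, and as far as I know pearsrnd expects the ordinary kurtosis (normal = 3) and rejects inputs that violate this bound. I believe psych reports excess kurtosis (normal = 0), but that is an assumption that should be verified against its documentation. A minimal Python sketch of the check, with a hypothetical helper name and the numbers above written with decimal points:

```python
def moments_feasible(skew, kurt, excess=False):
    """Return True if (skew, kurt) could belong to some probability distribution.

    kurt is the ordinary kurtosis (normal = 3) unless excess=True,
    in which case 3 is added first. Hypothetical helper, not a library call.
    """
    if excess:
        kurt = kurt + 3.0
    # Any distribution satisfies kurtosis >= skewness^2 + 1.
    return kurt > skew ** 2 + 1.0


# Values from the table above (decimal commas rewritten as points).
skew, kurt = 8.21, 98.56
ok_as_given = moments_feasible(skew, kurt)                 # treat 98.56 as ordinary kurtosis
ok_if_excess = moments_feasible(skew, kurt, excess=True)   # treat 98.56 as excess kurtosis
```

Running the same check on the parameters that returned NaNs would show whether their skewness and kurtosis pair is simply outside the feasible region.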

My question is: how can I extract more information from the little I was given, and which methods should I use for that?

Note that my knowledge of probability and statistics is very limited; I am only now getting used to these concepts.

  • 1
    What does "vars=3" mean? Is this multivariate data? Regardless, there isn't enough distribution here to pin down a single distribution. – Adrian May 17 '16 at 10:34
  • @Adrian I don't know what vars = 3 means. This is in fact all the information I was given. – MobileCushion May 17 '16 at 11:02
  • 1
    I meant "enough _information_ here to pin down a single distribution", by the way – Adrian May 17 '16 at 11:38
  • 2
    Related: [How to do estimation, when only summary statistics are available?](http://stats.stackexchange.com/q/37729/115908) (you still require a parametric model, but I guess there are some minimal requirements to fit a model given only a set of summary statistics). – Rod May 17 '16 at 19:53
  • 2
    I think that the best way here (in the sense of using all the given information without adding further assumptions) is to use the [Maximum entropy principle](https://web.stanford.edu/class/stats311/Lectures/lec-07.pdf) with constraints defined by all the descriptive statistics that are given. Because an analytical solution seems too complicated (if it is possible at all), it's better to search for a discretized version of the distribution on some interval with a constant step, using numerical optimization techniques and all the given descriptive statistics as equality constraints – Alexander Rodin May 18 '16 at 12:28
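For readers who want to see what the discretized maximum entropy idea from the last comment looks like, here is a deliberately reduced Python sketch. It is an illustration under strong assumptions, not the commenter's method: it uses only the min, max, and mean from the table as constraints, for which the maximum entropy solution on a grid has the form p_i ∝ exp(-lam * x_i), and it finds lam by bisection instead of a general optimizer. The function name and the grid size are hypothetical. Matching the sd, median, skew, and so on would need one extra multiplier per constraint and a proper numerical optimizer.

```python
import math
import random


def maxent_given_mean(lo, hi, target_mean, n_grid=2000):
    """Maximum entropy distribution on an even grid over [lo, hi] with a fixed mean.

    Returns (grid points, probabilities). Only the mean is constrained,
    so the solution is a discretized, truncated exponential.
    """
    if not lo < target_mean < hi:
        raise ValueError("target_mean must lie strictly between lo and hi")
    step = (hi - lo) / (n_grid - 1)
    xs = [lo + i * step for i in range(n_grid)]

    def probs(lam):
        exps = [-lam * x for x in xs]
        m = max(exps)                      # shift exponents to avoid overflow
        w = [math.exp(e - m) for e in exps]
        total = sum(w)
        return [wi / total for wi in w]

    def mean(lam):
        return sum(p * x for p, x in zip(probs(lam), xs))

    # The mean decreases as lam grows, so bracket the root and bisect.
    lam_lo, lam_hi = -1.0, 1.0
    while mean(lam_lo) < target_mean:
        lam_lo *= 2.0
    while mean(lam_hi) > target_mean:
        lam_hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lam_lo + lam_hi)
        if mean(mid) > target_mean:
            lam_lo = mid
        else:
            lam_hi = mid
    return xs, probs(0.5 * (lam_lo + lam_hi))


# min, max and mean from the table (decimal commas rewritten as points).
xs, ps = maxent_given_mean(0.02, 2411.33, 49.04)
rng = random.Random(0)                     # fixed seed so the draws are repeatable
draws = rng.choices(xs, weights=ps, k=5)   # values for Monte Carlo runs
```

Because only the mean is matched, the resulting distribution will not reproduce the reported sd or skewness; the sketch only shows the mechanics of turning summary statistics into something that can be sampled.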

0 Answers