I have a data set of the sales that a company made over the past 90 days. I need to determine which distribution these data may come from in order to use it in R. If I know the distribution, I can use functions that generate random numbers from it (for example, rnorm or rpois) to simulate the stock (how many products of that type) that this company will hold in the future, after a certain amount of time. My data looks like this:

  • day 1: 30
  • day 2: 32
  • day 3: 35
  • day 4: 37
  • day 5: 36
  • day 6: 32
  • day 7: 28
  • day 8: 35
  • day 9: 33
  • day 10: 30

And so on. More specifically, I need to find which theoretical distribution the sales follow and use it as my demand distribution in a stock/inventory management simulation. I only need the method for doing this, not the actual result, because the data is confidential and I cannot share all of it. Could someone please help me?

Andreea A

2 Answers

In general, to simulate from data, you don't necessarily need to, and perhaps shouldn't, try to find an approximating parameterised distribution. None might exist. Even if the data is generated from a closed-form distribution (I mean one that could be explicitly written down if somebody knew what it was), I'm not sure that there exists an exhaustive list of all described probability distributions that you could search for it.

Instead, one can use bootstrapping: random sampling with replacement from the actual data, i.e. taking the empirical distribution (as visualised by the histogram) as the underlying distribution.
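A minimal sketch of this idea in R, assuming the daily sales are stored in a numeric vector. The name `sales`, the 30-day horizon and the number of repetitions are illustrative choices, and the ten values from the question stand in for the full 90 days:

```r
# Ten observed daily sales from the question; replace with the full 90-day vector.
sales <- c(30, 32, 35, 37, 36, 32, 28, 35, 33, 30)

set.seed(42)  # only to make the sketch reproducible

# Simulate 30 future days of demand by sampling the observed values
# with replacement, i.e. drawing from the empirical distribution.
horizon <- 30
simulated_demand <- sample(sales, size = horizon, replace = TRUE)

# Repeat many times to get a distribution of total demand over the horizon.
n_sims <- 1000
total_demand <- replicate(
  n_sims,
  sum(sample(sales, size = horizon, replace = TRUE))
)

# Summarise the simulated totals, e.g. for setting a stock level.
quantile(total_demand, probs = c(0.05, 0.5, 0.95))
```

Here `sample(..., replace = TRUE)` plays the role that `rnorm` or `rpois` would play for a parametric distribution. Note that this treats the days as independent and identically distributed, so it ignores any day-of-week pattern or autocorrelation in the sales.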

Bence Mélykúti

"Capturing seasonality in multiple regression for daily data" would be a good place to start. Essentially, once you have formed a predictive model, you can forecast/simulate future periods. The simulation is a Monte Carlo ... bootstrap that uses the model residuals to give you a distribution of values for each future period.

"Simple method of forecasting number of guests given current and historical data" contains a discussion of simulation/forecasting of daily data (the number of people having lunch).

With 90 days of data, your model might be quite simple, incorporating ARIMA, daily effects, outliers, level shifts, etc.
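The residual-bootstrap idea can be sketched in R as follows. This is a deliberately simplified stand-in, not the full model described above: it uses a plain linear regression on day-of-week dummies (no ARIMA terms, outliers or level shifts), and the 90 days of sales are synthetic placeholder values generated for illustration, since the real data is not available. All variable names are hypothetical.

```r
set.seed(1)  # only to make the sketch reproducible

# Synthetic placeholder for 90 days of sales (NOT the asker's data):
# Poisson draws around a made-up weekly pattern.
n <- 90
weekly_mean <- c(30, 32, 35, 37, 36, 32, 28)
dow <- factor((seq_len(n) - 1) %% 7 + 1, levels = 1:7)  # day-of-week index 1..7
sales <- rpois(n, lambda = weekly_mean[as.integer(dow)])

# Regression on day-of-week dummies captures the weekly seasonality.
fit <- lm(sales ~ dow)
res <- residuals(fit)

# Point forecast for the next 'horizon' days.
horizon <- 14
future_dow <- factor((n + seq_len(horizon) - 1) %% 7 + 1, levels = 1:7)
point_forecast <- predict(fit, newdata = data.frame(dow = future_dow))

# Monte Carlo bootstrap: add resampled residuals to the point forecast
# to obtain simulated demand paths (rounded, and floored at zero).
n_sims <- 1000
sim_paths <- replicate(
  n_sims,
  pmax(0, round(point_forecast + sample(res, horizon, replace = TRUE)))
)

# sim_paths is a horizon x n_sims matrix; each column is one simulated future.
# Per-day quantiles give a distribution of values for each future period.
apply(sim_paths, 1, quantile, probs = c(0.05, 0.5, 0.95))
```

Resampling the residuals independently is only reasonable if the model has removed the autocorrelation. If the residuals are still correlated, a richer model (for example one with ARIMA errors) would be needed before bootstrapping.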

IrishStat
  • I do not need to forecast it. I need to generate numbers that follow a certain distribution, and that distribution is the one I need to find. – Andreea A Mar 19 '18 at 10:39
  • That is precisely what a forecast is. It is a simulation into the future. If you create 90 values, I will demonstrate this. – IrishStat Mar 19 '18 at 10:41
  • I will simulate the stock, the number of products that remain in stock after they sell the products, based on the distribution of the demand (I assume the demand follows the same distribution as the sales). – Andreea A Mar 19 '18 at 10:43
  • I made a simulation model of stock/inventory management in which I use a Poisson distribution to see the total cost, but now I need the real distribution of my data set. – Andreea A Mar 19 '18 at 10:45
  • The "distribution" that you are looking for may depend on the day of the week; thus there would be a family of distributions, one for each day of the week. Care would have to be taken when generating these distributions/conditional forecasts to reflect anomalies, level shifts, et al. – IrishStat Mar 19 '18 at 10:46
  • The distribution of the composite would be useless if there is auto-correlation in the data of any form. – IrishStat Mar 19 '18 at 10:49