
I found a similar question on this forum. As a rule of thumb, since there are 4 independent variables in my case, I need 4*10=40 data points. However, my question differs slightly, since I also want to ask about data generation.

I want to develop a nonlinear regression equation of the form $$y=c_{1} A^k B^l C^m D^n$$ by estimating the regression parameters $$c_{1}, k, l, m, n$$ To reduce this to a linear regression, I take logarithms: $$\log(y)=\log(c_{1})+k\log(A)+l\log(B)+m\log(C)+n\log(D)$$ where y is the dependent variable and A, B, C, D are the independent variables. The independent variables vary over the following ranges: $$0.01 \le A \le 0.6$$ $$50 \le B \le 1000$$ $$0.001 \le C \le 10$$ $$0.1 \le D \le 10$$

To create data points, I can fix arbitrary values for 3 of the 4 independent variables within these ranges, vary the 4th one at some arbitrary interval, and simulate the corresponding value of y with my model. I can repeat this for each variable. But there is an infinite number of possible combinations. What is the best way to create data points of y by varying A, B, C, D, and how many points would be appropriate in this case?
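To make the setup concrete, here is a minimal sketch of the workflow described above: draw points over the stated ranges, simulate y, and fit the log-linear form by ordinary least squares. Everything beyond the ranges is an assumption for illustration: `simulate_y` is a hypothetical stand-in for the real model with made-up parameter values and an assumed multiplicative noise, and sampling uniformly in log space is only one possible choice, picked because the fit is done on the logs.

```python
import numpy as np

# Ranges of the independent variables, as stated in the question.
RANGES = {"A": (0.01, 0.6), "B": (50.0, 1000.0), "C": (0.001, 10.0), "D": (0.1, 10.0)}


def sample_log_uniform(n_points, rng):
    """Draw n_points rows; each variable is uniform in log10 over its range."""
    cols = [10 ** rng.uniform(np.log10(lo), np.log10(hi), size=n_points)
            for lo, hi in RANGES.values()]
    return np.column_stack(cols)


def simulate_y(X, rng):
    """Hypothetical stand-in for the simulation model (made-up parameters)."""
    c1, k, l, m, n = 2.0, 0.5, -1.0, 0.25, 1.5   # assumed, not from the question
    noise = rng.normal(0.0, 0.05, size=len(X))   # assumed log-scale noise
    return c1 * X[:, 0]**k * X[:, 1]**l * X[:, 2]**m * X[:, 3]**n * np.exp(noise)


def fit_power_law(X, y):
    """OLS on log(y) = log(c1) + k log(A) + l log(B) + m log(C) + n log(D)."""
    design = np.column_stack([np.ones(len(X)), np.log(X)])
    beta, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    # Back-transform the intercept to c1; the slopes are k, l, m, n directly.
    return np.concatenate([[np.exp(beta[0])], beta[1:]])


rng = np.random.default_rng(0)
X = sample_log_uniform(40, rng)   # 40 points, following the 10-per-variable rule of thumb
y = simulate_y(X, rng)
estimates = fit_power_law(X, y)
print(dict(zip(["c1", "k", "l", "m", "n"], estimates)))
```

Unlike varying one variable at a time, this moves all four variables together in every run, so each point contributes information about every exponent.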

In other words, given a specified margin of error at 99% confidence for each parameter, how should I choose my data points (A, B, C, D) within their ranges so that the overall sample size is minimized? Note: there can be an infinite number of combinations of A, B, C, D when arbitrary intervals are used for each independent variable within its individual range.
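As a rough sketch of how the margin-of-error question could be framed: for the log-linear model, the covariance of the OLS estimates is $\sigma^2 (X^\top X)^{-1}$, so the half-width of each interval can be computed for a candidate design before any simulation is run. The code below assumes a two-level full factorial at the range endpoints in log space (16 corner points, replicated), a normal approximation for the 99% interval, and made-up values for the error standard deviation `SIGMA` and the target half-width `TARGET`. None of these come from the question. Note also that if the simulation is deterministic, exact replicates add no information, and a two-level design cannot reveal departures from the power-law form.

```python
import itertools
from statistics import NormalDist

import numpy as np

# Ranges of A, B, C, D as stated in the question.
RANGES = [(0.01, 0.6), (50.0, 1000.0), (0.001, 10.0), (0.1, 10.0)]


def corner_design(replicates):
    """Design matrix for a replicated 2^4 factorial at the range endpoints (log space)."""
    log_ends = [(np.log(lo), np.log(hi)) for lo, hi in RANGES]
    corners = np.array(list(itertools.product(*log_ends)))   # 16 rows, 4 columns
    logs = np.tile(corners, (replicates, 1))
    return np.column_stack([np.ones(len(logs)), logs])       # intercept column first


def half_widths(design, sigma, confidence=0.99):
    """Normal-approximation half-widths for (log c1, k, l, m, n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    cov = sigma**2 * np.linalg.inv(design.T @ design)
    return z * np.sqrt(np.diag(cov))


def smallest_replication(sigma, target, max_replicates=1000):
    """Smallest replicate count whose exponent half-widths all meet the target."""
    for r in range(1, max_replicates + 1):
        if np.all(half_widths(corner_design(r), sigma)[1:] <= target):
            return r
    return None


SIGMA = 0.05    # assumed s.d. of the error in log(y)
TARGET = 0.01   # assumed target half-width for each exponent
r = smallest_replication(SIGMA, TARGET)
if r is not None:
    print("replicates:", r, "total runs:", 16 * r)
    print("half-widths:", half_widths(corner_design(r), SIGMA))
```

The exponent with the narrowest range in log space (here B) drives the required sample size, since its half-width shrinks most slowly.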

maxm
  • First you say you want to estimate the parameters, but then you sound as if you know them? Do I miss something? – ttnphns Jun 21 '14 at 07:19
  • I do not know the regression parameters (c1, k, l, m, n). I want to estimate them, but first I have to create a data set for the regression analysis. The ranges given are for the independent variables (A, B, C, D), which are known. I do not know a better way of creating the data set, since using arbitrary combinations of these 4 variables will result in a very large (infinite) number of combinations. – maxm Jun 21 '14 at 07:49
  • So, is the idea that you want to specify some level of accuracy for each parameter, and then choose your data points to sample such that the overall sample size (or cost) is minimised? – probabilityislogic Jun 21 '14 at 07:56
  • Yes, that's right. Thank you for putting the question so nicely. I edited my question accordingly. – maxm Jun 21 '14 at 08:14
  • I would start by trying to use the delta rule, and consider that you are trying to determine variances. – EngrStudent Jun 21 '14 at 17:35

0 Answers