
Context: say I am trying to determine the concentration of a chemical, so I take known concentrations of the chemical and make a standard curve (6 measurements per standard) then measure my unknown 6 times.

Here's what I can do: I can use linear regression to determine confidence limits for the concentration of each independent read of my unknown sample (by using a selection variable and saving predictions from the REGRESSION command), but that's not what I want. I want a single estimate (with standard error or CI bounds) for my single unknown sample that accounts for both a) the error in the regression model built from my standard concentrations and b) the error in the measurements of my one unknown sample.

How can I do this in SPSS? If I try to save predicted values from GENLINMIXED, MIXED, or GLM, I get either no predicted values for my unknown (because no dependent value was listed) or a unique estimate and error for each replicate measurement of my unknown (when I want the group estimate, not the estimate for each replicate).

Here are some example data I made up. Depending on the analysis you want to try, it might be necessary to run VARSTOCASES first. I don't want to make this a tall question, so I'm submitting it in this format:

data list list /conc read1 read2 read3 read4 read5 read6 stdCurve.
begin data
0   .00446  .00515  .00450  .00519  .00500  .00492  1
100 .10054  .10484  .10086  .10877  .10293  .10747  1
200 .20695  .20083  .21797  .21949  .18936  .19672  1
300 .32355  .31071  .30802  .30414  .30014  .26003  1
400 .42793  .40888  .40227  .41009  .39880  .39879  1
500 .47858  .47810  .55102  .49355  .51650  .46561  1
600 .66123  .62981  .62180  .54510  .67363  .65373  1
700 .65611  .70905  .74126  .71843  .69953  .77222  1
800 .86298  .75166  .86441  .77430  .82915  .85193  1
900 .92009  .94197  .92889  .91114  .79323  .93604  1
1000    .90955  1.00724 1.02682 .95047  1.03176 1.16755 1
.   .56990  .55395  .58641  .51932  .59506  .55967  0
end data.
kjetil b halvorsen
DocBuckets
  • I don't feel it is right to call it repeated measurements. Your 6 READ levels don't seem to differ in some systematic way (e.g. condition). Rather, you just performed 6 assessment attempts instead of 1, for precision. Right? – ttnphns May 23 '12 at 05:38
  • Correct. The only variation in them would come from random variation in the measurement itself. I should point out that these data are _completely simulated_. In reality, the "unknown" sample might have actually been measured from independent collections or something to that effect. Either way, I still want to know the same thing: How does one infer an unknown dependent when both the model-building data and the unknown inputs have uncertainty? Can this be done rigorously at all? in SPSS? – DocBuckets May 23 '12 at 21:26
  • It sounds to me that you use `concentration` as DV and `read` values as IV. (Am I correct?) This way you naturally get several varying prediction values for the DV. Whereas your calibration `concentration` values are true (error free) and should be the IV. – ttnphns May 24 '12 at 07:30
  • Also correct. However, in order to have SPSS predict a value of concentration for my unknown, I need to input my read values as independent. I don't know how to have SPSS use a regression model to go from a measured dependent (read) to an estimated independent (concentration) with error. – DocBuckets May 24 '12 at 17:12
  • Predict `read` by `concentration`. Estimate and plot upper and lower confidence bounds around the regression line. Compute the mean `read` for the sample of unknown concentration. Draw a horizontal line at that mean, intersect it with the aforementioned confidence bounds, and project the two points of intersection onto the concentration axis. Those will be the bounds for the concentration of your sample. In your example data, however, there is the problem of heteroscedasticity (the point cloud is fan-shaped), and hence the usual OLS confidence interval is inappropriate. – ttnphns May 24 '12 at 20:31

1 Answer


I found my answer here: I. Lavagnini and F. Magno, "A statistical overview on univariate calibration, inverse regression, and detection limits: Application to gas chromatography/mass spectrometry technique," Mass Spectrometry Reviews 26, 1-18 (2007).

Specifically, my case requires WLS rather than OLS regression, together with the inverse regression technique that uses a t-distribution for the measured dependent variable of the unknown (the authors describe it as the "third technique"). Since SPSS doesn't do well with storing computational variables, I wrote an addon that computes all of the necessary terms and outputs new variables with the expected dependent value and 95% CI. One thing this paper does not address is how to solve for the intersection of the regression confidence bands with the distribution of the unknown's dependent variable: it's a quadratic. This means that there are two possible values for each bound of the CI. So far, my simulated and real data have shown minuscule differences between the two solutions, so I will just take their mean.
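For readers who want to see the arithmetic, here is a minimal sketch in plain Python (not SPSS, and not the addon mentioned above). It fits a weighted line to the standards from the question and then solves the quadratic for the concentration interval. Several things here are assumptions rather than details from the paper or the addon: the weights are taken as 1 / (replicate variance) at each standard level, the unknown gets the same kind of weight from its own six reads, the two roots of a single Fieller-type quadratic are used as the lower and upper bounds, and the t critical value is passed in by hand (about 1.998 for 64 degrees of freedom at 95%). The function name `wls_inverse_interval` is hypothetical.

```python
import math
from statistics import mean, variance

# Example data from the question: concentration -> six replicate reads.
standards = {
    0:    [.00446, .00515, .00450, .00519, .00500, .00492],
    100:  [.10054, .10484, .10086, .10877, .10293, .10747],
    200:  [.20695, .20083, .21797, .21949, .18936, .19672],
    300:  [.32355, .31071, .30802, .30414, .30014, .26003],
    400:  [.42793, .40888, .40227, .41009, .39880, .39879],
    500:  [.47858, .47810, .55102, .49355, .51650, .46561],
    600:  [.66123, .62981, .62180, .54510, .67363, .65373],
    700:  [.65611, .70905, .74126, .71843, .69953, .77222],
    800:  [.86298, .75166, .86441, .77430, .82915, .85193],
    900:  [.92009, .94197, .92889, .91114, .79323, .93604],
    1000: [.90955, 1.00724, 1.02682, .95047, 1.03176, 1.16755],
}
unknown = [.56990, .55395, .58641, .51932, .59506, .55967]


def wls_inverse_interval(standards, unknown, t_crit):
    """Return (estimate, lower, upper) for the unknown concentration."""
    # ASSUMPTION: weight each standard read by 1 / variance of its replicates.
    x, y, w = [], [], []
    for conc, reads in standards.items():
        wi = 1.0 / variance(reads)
        for r in reads:
            x.append(conc)
            y.append(r)
            w.append(wi)

    # Weighted least squares fit of read = a + b * conc.
    n, sw = len(x), sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    s2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / (n - 2)

    # Mean read of the unknown; ASSUMPTION: its weight is 1 / its own variance.
    m, y0 = len(unknown), mean(unknown)
    w0 = 1.0 / variance(unknown)
    x0 = (y0 - a) / b  # point estimate by inverting the fitted line

    # Solve (y0 - a - b*x)^2 = t^2 s^2 [1/(w0 m) + 1/sum(w) + (x - xbar)^2 / sxx]
    # as a quadratic in d = x - xbar.
    k = t_crit ** 2 * s2
    c = 1.0 / (w0 * m) + 1.0 / sw
    dy = y0 - ybar
    qa = b * b - k / sxx
    qb = -2.0 * b * dy
    qc = dy * dy - k * c
    if qa <= 0:
        raise ValueError("slope is not distinguishable from zero; no finite interval")
    root = math.sqrt(qb * qb - 4.0 * qa * qc)
    lower = xbar + (-qb - root) / (2.0 * qa)
    upper = xbar + (-qb + root) / (2.0 * qa)
    return x0, lower, upper


# t_crit ~ 1.998 is a table value for 66 - 2 = 64 df, two-sided 95%.
x0, lower, upper = wls_inverse_interval(standards, unknown, t_crit=1.998)
print(f"estimate = {x0:.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

In this formulation the interval accounts for both the uncertainty of the fitted line and the scatter of the unknown's replicate reads, which is the combination the question asks for. A different weighting scheme (for example, a variance model fitted across concentrations) would change the numbers but not the structure of the calculation.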

I am leaving this here as a resource for future people. If anyone wants my SPSS addon, let me know.

DocBuckets