
I want to use Python to predict the measured value of a chemical reaction. As inputs I have time (0, 2, 4, ...) and the concentrations of two solutions. As output I have a chemical measurement.

Here are some example measurements.

Example 1:

    time   concentration1   concentration2   result
    0      100              0                123
    2      100              0                100

Example 2:

    time   concentration1   concentration2   result
    0      95               5                321
    2      95               5                300

I tried linear regression and polynomial regression, but the prediction is biased. This is because time depends on the other variables.

If time were independent, then with both concentrations at 0 and time at, for example, 5, I would still get an output value != 0.

My question is: how should I process this data? What type of regression should I use?

O.Rares
  • Can you please clarify what you mean by "time depends on the other variables"? What other variables? How does time depend on them? Are those variables observed? (Time is typically thought of as being independent of everything, so maybe you mean that the effect of concentration levels on your output measurement varies by level of time?) – AlexK Apr 16 '19 at 20:21
  • @AlexK I said that it is dependent because when I try to give to the other variables value 0 and to give time a non zero value I would still get a result ( based on linear/ polynomial regression). The other variables are the concentrations. The output value varies with time for the same concentrations. – O.Rares Apr 16 '19 at 20:24
  • Also, if you are interested in obtaining regression output of 0 when concentrations are 0, regardless of the value of time, you may need to construct a regression through the origin, i.e., regression without an intercept term. See this Q&A, for example: https://stats.stackexchange.com/q/11064/241093 – AlexK Apr 16 '19 at 20:25
  • So then the question becomes, why do you need to include time as an independent variable at all? Do you hypothesize that output measurements are affected by how much time has elapsed, regardless of concentration levels? This is all just a matter of specifying the model equation based on how you think variables affect the outcome. – AlexK Apr 16 '19 at 20:30
  • @AlexK Yes, the output is affected by the time elapsed. Different measurements were made for different concentrations so the model should predict a value based on time elapsed and 2 concentrations – O.Rares Apr 16 '19 at 20:38
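
One way to encode what the comments describe (no intercept, and time only acting through the concentrations) is a least-squares fit on the columns c1, c2, t*c1 and t*c2. The following is a minimal sketch, not a recommended final model: it assumes NumPy is available, it assumes the time effect scales linearly with each concentration (nothing in the measurements confirms this), and `design_matrix` and `predict` are hypothetical helper names. It uses only the four rows quoted in the question, so with four coefficients the fit is exact and says nothing about how well the model generalizes.

```python
import numpy as np

# The four measurements quoted in the question:
# time, concentration1, concentration2, result.
data = np.array([
    [0.0, 100.0, 0.0, 123.0],
    [2.0, 100.0, 0.0, 100.0],
    [0.0,  95.0, 5.0, 321.0],
    [2.0,  95.0, 5.0, 300.0],
])


def design_matrix(t, c1, c2):
    """Build columns c1, c2, t*c1, t*c2.

    There is no intercept column and no standalone time column
    (a modelling assumption), so every column is zero whenever
    both concentrations are zero.
    """
    t, c1, c2 = (np.asarray(v, dtype=float) for v in (t, c1, c2))
    return np.column_stack([c1, c2, t * c1, t * c2])


X = design_matrix(data[:, 0], data[:, 1], data[:, 2])
y = data[:, 3]

# Ordinary least squares without an intercept (regression through the origin).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict(t, c1, c2):
    """Predict the measurement for scalar or array-like inputs."""
    features = design_matrix(np.atleast_1d(t), np.atleast_1d(c1), np.atleast_1d(c2))
    return features @ coef


print(predict(5, 0, 0))    # zero concentrations give a zero prediction
print(predict(2, 100, 0))  # reproduces the second measured row
```

With this feature set a prediction at zero concentrations is 0 for any time, by construction. Whether the interaction terms t*c1 and t*c2 are the right shape (rather than, say, an exponential decay in time) would need to be checked against more measurements.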
