
I'm wondering how a machine learning approach can solve a problem that has constraints.

Let's say we have a demand prediction problem (regression) and the demand must be less than or equal to 50. Therefore, the outputs of the model must also be less than or equal to 50.

In this situation, how can I enforce the constraint (demand <= 50) in a machine learning algorithm? More generally, how can I enforce integer, equality, and inequality constraints?

I think I can use a Lagrange multiplier, but I'm not sure. Can I include the constraints in the loss function of the model?

Yoo Inhyeok
  • I'd definitely go for a custom loss function. However, there would be more to say if we knew more about your problem and your data. – carlo Nov 08 '19 at 09:28
  • @carlo There are no data; I asked just out of curiosity. Can you explain in more detail how I can make my own loss function? – Yoo Inhyeok Nov 08 '19 at 09:31
  • @carlo For example, demand must be a positive number because it can never be negative. However, a linear regression line can predict negative demand, so I want to add this positivity condition to my model. – Yoo Inhyeok Nov 08 '19 at 09:35
  • if demand $\in [0, 50]$, you can normalize it to $[0,1]$ and use logistic regression with the returned probabilities as predictions (or any sigmoid-like head for a NN). Also, machine learning requires data to learn from (up front or through reinforcement learning), because you have to find the correct values of the coefficients in your model – quester Nov 08 '19 at 21:46
  • @quester So changing a regression problem to a classification problem is the answer? – Yoo Inhyeok Nov 11 '19 at 08:50
  • @YooInhyeok no, it's more a choice of "decision function"/"head for NN": a sigmoid-like head guarantees that the model outputs values in a given interval. You would still use L2 loss, because you are modeling how many units to order, so it is still a regression problem – quester Nov 11 '19 at 09:18
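
The sigmoid-head idea from the comments above can be sketched as follows. This is only a minimal illustration, not quester's code: it assumes NumPy, and the names `bounded_head` and `to_unit` are hypothetical helpers introduced here.

```python
import numpy as np

LOW, HIGH = 0.0, 50.0  # demand bounds taken from the comments


def sigmoid(z):
    """Logistic function; z is clipped to avoid overflow in exp."""
    z = np.clip(z, -500.0, 500.0)
    return 1.0 / (1.0 + np.exp(-z))


def bounded_head(z, low=LOW, high=HIGH):
    """Map an unconstrained score z to the interval (low, high)."""
    return low + (high - low) * sigmoid(z)


def to_unit(y, low=LOW, high=HIGH):
    """Normalize a target from [low, high] to [0, 1]."""
    return (np.asarray(y, dtype=float) - low) / (high - low)


# Any real-valued score, however extreme, lands inside the bounds.
scores = np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0])
print(bounded_head(scores))
```

The model would still be trained with a regression loss (for example squared error) between the targets and `bounded_head(score)`, or equivalently between `to_unit(y)` and `sigmoid(score)`.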

1 Answer


If your model is $f$ you can always transform the outputs, so that they meet the constraint. For your example you could use something like

$$ g(x) = 50-\exp(-f(x)) $$

and instead of minimizing the loss between $y$ and $f(x)$, minimize the loss between $y$ and $g(x)$. You would then be looking for the parameters that make the transformed predictions fit the data best.
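
As a rough sketch of what this could look like in practice, the code below fits a linear $f$ by gradient descent on the squared error between $y$ and $g(x) = 50 - \exp(-f(x))$. Everything beyond that formula is an assumption made for illustration: the linear form of $f$, the synthetic data, the learning rate, and the function names.

```python
import numpy as np

UPPER = 50.0  # upper bound on demand, from the question


def f(X, w, b):
    """Unconstrained linear model (an assumed form for f)."""
    return X @ w + b


def g(X, w, b):
    """Transformed prediction 50 - exp(-f(x)); never exceeds UPPER."""
    return UPPER - np.exp(-f(X, w, b))


def fit(X, y, lr=0.005, steps=2000):
    """Minimize mean squared error between y and g(X) by gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        e = np.exp(-f(X, w, b))             # e > 0, and dg/df = e
        resid = (UPPER - e) - y             # g(X) - y
        grad_f = 2.0 * resid * e / len(y)   # d(MSE)/df by the chain rule
        w -= lr * (X.T @ grad_f)
        b -= lr * grad_f.sum()
    return w, b


# Synthetic data (illustrative only): targets lie below the bound.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = UPPER - np.exp(-(0.8 * X[:, 0] - 1.0)) + rng.normal(0.0, 0.1, size=200)

w, b = fit(X, y)
pred = g(X, w, b)
print("largest prediction:", pred.max())
```

The loss itself is unchanged; only the quantity compared with $y$ differs, which is why the gradient picks up the extra factor $\exp(-f(x))$.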

Tim
  • Thanks! So modifying both the output function and the loss function is your answer, right? One more question: why did you use $-\exp(-f(x))$? – Yoo Inhyeok Nov 11 '19 at 08:53
  • @YooInhyeok $\exp(-f(x))$ is always positive, so $50-\exp(-f(x))$ is an increasing function of $f(x)$ that always stays below 50. You can also choose other functions; this is just an example. – Tim Nov 11 '19 at 09:10
  • I'm sorry, but I don't understand what you mean. Can you give me a more detailed explanation of the function $g(x)$? – Yoo Inhyeok Nov 11 '19 at 10:01
  • @YooInhyeok what exactly is unclear for you? – Tim Nov 11 '19 at 11:21
  • I think the function $g(x)$ is the output of the model in this case. I want to know how you derived the function $g(x)$. In other words, why is $g(x)$ my constraint? – Yoo Inhyeok Nov 12 '19 at 09:44
  • @YooInhyeok it is not a constraint. It transforms the output of the regular model $f$ so that it fits the restricted range, simply by mapping it to that range. You don't need to add any constraints or change the loss function. It works like an activation function in neural networks, or a link function in generalized linear models. – Tim Nov 12 '19 at 10:24
  • Okay, I understand what you mean. What I asked about is how the function $g(x)$ was derived. I believe I must consider the range of $x$ when deriving $g(x)$, because when $\exp(-f(x))$ is larger than 50, the whole function $g(x)$ will be negative. – Yoo Inhyeok Nov 12 '19 at 12:00
  • Then what about integer constraints? – Yoo Inhyeok Nov 12 '19 at 12:01
  • @YooInhyeok why exactly do you need integer constraints? In regression models we usually do not use them (even when working with count data). You can always round the values after training the model and making predictions if you really need to, but usually having more precise real-valued predictions makes more sense. – Tim Nov 12 '19 at 12:29
  • Okay. I was just curious. I appreciate the nice answers! – Yoo Inhyeok Nov 12 '19 at 12:41
  • @YooInhyeok as for the function itself, check the plot https://www.wolframalpha.com/input/?i=50-exp%28-x%29 ; you can play around with the +/- signs or with the constant 50 to see what happens. – Tim Nov 12 '19 at 14:32
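
A short worked check of the range of $g$, using only the formula from the answer, may help with the point raised in the comments about negative values. Since the exponential is always positive,

$$ \exp(-f(x)) > 0 \;\Longrightarrow\; g(x) = 50 - \exp(-f(x)) < 50 $$

so the upper bound holds for every real value of $f(x)$. The prediction is non-negative only when

$$ g(x) \ge 0 \iff \exp(-f(x)) \le 50 \iff f(x) \ge -\ln 50 \approx -3.91 $$

In other words, this particular transform enforces the upper bound only. If a lower bound of 0 is also required, one option (in the spirit of the sigmoid suggestion in the comments on the question) is a scaled sigmoid such as

$$ g(x) = \frac{50}{1 + \exp(-f(x))} $$

which takes values strictly between 0 and 50.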