I am looking for an algorithm that fits the following problem.

Scenario: I want to predict the composition of deliveries of natural resources (ores). I have historical data on the composition of each delivery, together with its date and supplier. I now want to predict the composition of the resources that supplier x will send me on date y.

You can think of the composition as a vector A = {a1, a2, a3, a4, a5}, where a1 is the share of one material in A, in percent, so the components of A sum to 100%. a1 depends on the historical data and on a2, a3, a4 and a5.
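
A minimal sketch of this sum constraint in Python, assuming a composition is stored as a plain list of five percentages. The helper name `close_composition` and the sample values are made up for illustration and are not taken from the real data:

```python
import math

def close_composition(parts, total=100.0):
    """Rescale non-negative parts so they sum to `total` (the "closure" operation)."""
    if any(p < 0 for p in parts):
        raise ValueError("parts must be non-negative")
    s = sum(parts)
    if s <= 0:
        raise ValueError("at least one part must be positive")
    return [total * p / s for p in parts]

# Hypothetical delivery: five material shares in percent (illustrative values only).
A = close_composition([80.0, 8.0, 6.0, 4.0, 2.0])
assert math.isclose(sum(A), 100.0)

# Because the parts sum to 100, any one part is fixed by the other four.
a1_from_rest = 100.0 - sum(A[1:])
```

This is why the five components cannot be modeled as five independent outputs: only four of them are free.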

So my inputs are x and y, and I want to predict the vector A. Which algorithm should I use to train my model? Is there an algorithm that predicts A from x and y, given the old deliveries?

I was thinking of multivariate regression, but I also need multiple outputs, and the model must account for the dependencies between a1, a2, a3, a4 and a5.

I'm looking forward to your ideas!

Edit: The structure of the data looks like this (these are dummy data, but very close to the real data; the dummy data have the same temporal evolution as the real data):

[Image: example of the delivery data]

As you can see, there are different suppliers. The same supplier ID means that the resource comes from the same location. Because of this, the inputs should be the delivery date and the supplier.

Note: I only want to predict deliveries from current suppliers, that is, suppliers for whom I have composition data from older deliveries. I do not want to predict the composition of a delivery from a new supplier. These are all long-term suppliers, so I have data from the last 10 years. The amount of a material in a delivery fluctuates (because the ore vein is not a solid block), but it always stays in the same range (for example, material 1 is always between 75% and 85%).

TheJed
  • 2
    Multivariate regression **is** regression with multiple outputs. Regression with multiple inputs is called multiple regression. See https://stats.stackexchange.com/questions/2358/explain-the-difference-between-multiple-regression-and-multivariate-regression But the multiple outputs are the least of your troubles: you can use the softmax function (or look on this site for "multivariate logistic regression"). The real issue here is the temporal component, which makes this look like time series modeling more than multivariate regression. Can you please show some examples of your data?... – DeltaIV Dec 03 '17 at 14:44
  • ...in particular, plots of $a_i$ as a function of $x$ (categorical: the specific supplier) and $y$ (a date) would help. Finally, do you always buy from a specific group of suppliers, or can you buy from a vast (potentially infinite) population of different suppliers? You need fixed effects in the former case and random effects in the latter. – DeltaIV Dec 03 '17 at 14:46
  • 1
    I edited the post to add example data (these are dummy data, but they show the structure). – TheJed Dec 03 '17 at 15:05
  • Thanks, that's better (+1). However, we still need more info (or I do, but I think the same goes for other users). Is it OK to assume that the only suppliers you want to buy from are those in your training set (e.g., 1223, 2342, 34223 and 34534 in your example), or do you also want to consider the uncertainty due to the possibility that you could buy from a similar, but previously unseen, supplier? Also, do the dummy data have the same temporal evolution as your real data? We can't verify that, of course, but you should, by making plots. Otherwise we may give wrong and potentially risky advice. – DeltaIV Dec 03 '17 at 15:27
  • 1
    Thanks for your answer, DeltaIV. I edited the post again: I just want to predict the delivery of a current supplier for whom I have old data. The dummy data have nearly the same temporal evolution as the real data (I updated the image). – TheJed Dec 03 '17 at 15:56
  • 1
    Multiple regression can have two meanings. As DeltaIV noted, it can refer to multiple inputs. It can also refer to multiple dependent variables, in which case the more correct terminology would be canonical correlation and/or MANOVA-type models. The twist in your case is that your data is compositional. Check out Pawlowsky-Glahn's book *Modeling and Analysis of Compositional Data* or Aitchison's *The Statistical Analysis of Compositional Data*. – Mike Hunter Dec 03 '17 at 18:26
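
The suggestions in the comments (softmax, compositional data methods) can be combined in a rough sketch. This is only one possible reading of them, not a method confirmed by anyone in the thread. Assumptions: each composition is mapped to additive log-ratio (alr) coordinates, a separate linear trend over time is fitted per supplier and per coordinate, and predictions are mapped back with the inverse alr, which is a softmax with the last logit fixed at 0. All function names and the sample compositions are made up; only the supplier IDs come from the comments. Zero shares would need special handling because of the logarithm.

```python
import math
from datetime import date

def alr(comp):
    """Additive log-ratio: log of each part relative to the last part."""
    return [math.log(c / comp[-1]) for c in comp[:-1]]

def alr_inv(z, total=100.0):
    """Inverse alr (softmax with the last logit fixed at 0), scaled to `total`."""
    m = max(max(z), 0.0)  # subtract the largest logit for numerical stability
    e = [math.exp(v - m) for v in z] + [math.exp(-m)]
    s = sum(e)
    return [total * v / s for v in e]

def fit_trend(ts, ys):
    """Least-squares line through (t, y); returns (t_mean, y_mean, slope)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    var = sum((t - t_mean) ** 2 for t in ts)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = 0.0 if var == 0 else cov / var  # flat line if only one date
    return t_mean, y_mean, slope

def fit(deliveries):
    """deliveries: list of (supplier_id, date, composition). One model per supplier."""
    by_supplier = {}
    for supplier, day, comp in deliveries:
        by_supplier.setdefault(supplier, []).append((day.toordinal(), alr(comp)))
    models = {}
    for supplier, rows in by_supplier.items():
        ts = [t for t, _ in rows]
        k = len(rows[0][1])  # number of alr coordinates (parts minus one)
        models[supplier] = [fit_trend(ts, [z[j] for _, z in rows]) for j in range(k)]
    return models

def predict(models, supplier, day):
    """Predicted composition in percent for a known supplier on a given date."""
    t = day.toordinal()
    z = [y_mean + slope * (t - t_mean) for t_mean, y_mean, slope in models[supplier]]
    return alr_inv(z)

# Illustrative history (compositions are invented, not real measurements).
history = [
    (1223, date(2017, 1, 10), [80.0, 8.0, 6.0, 4.0, 2.0]),
    (1223, date(2017, 6, 10), [78.0, 9.0, 6.0, 5.0, 2.0]),
    (1223, date(2017, 11, 10), [82.0, 7.0, 5.0, 4.0, 2.0]),
    (2342, date(2017, 3, 1), [76.0, 10.0, 7.0, 5.0, 2.0]),
    (2342, date(2017, 9, 1), [77.0, 10.0, 6.0, 5.0, 2.0]),
]
models = fit(history)
pred = predict(models, 1223, date(2018, 1, 10))
```

By construction every prediction is positive and sums to 100%, which a plain regression on the raw percentages would not guarantee. A straight-line trend is the simplest possible choice here; with 10 years of data, a proper time series model in the log-ratio coordinates may be more appropriate.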

0 Answers