
How do I find the y values if I am given data for the variables x1, x2, x3, ..., x11, along with the mean of y (0.010451) and the standard deviation of y (0.004336)?

I tried a multivariable regression, y = b1x1 + b2x2 + b3x3 + ... + b11x11, solving for the betas by setting the objective to the provided mean of y and adding a constraint for the provided standard deviation of y. That didn't work.

I've looked into unsupervised learning techniques, which I believe might lead to a solution. I am looking at PCA and using the principal component values to help solve for a beta, by solving for the correlation in the following equation:

http://www.stat.wmich.edu/s216/book/node126.html

The mean of a principal component is 0, so in the equation above bX = 0, and the intercept a is therefore equal to the mean of Y. Using Solver to find the correlation in that equation, together with the standard deviation of PC1 (principal component 1) and the standard deviation of Y, produces a beta for Y = PC1*B + a. However, this exercise does not seem meaningful, because the standard deviation of x is so small that changing the correlation doesn't appear to do much.

Any insights would be great.

Kat

1 Answer


If this is all you have, there is no way to do this. You can't do a multiple regression (which you tried) because you don't have the individual y values. You can't do unsupervised learning because that won't give you Y values. You want supervised learning but you have nothing to supervise the learning with. PCA is not going to help either because that is a dimension reduction method, not a way to solve for Y.

Sorry, but there isn't a solution except to use the mean of Y for every y value.
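To illustrate why the mean and standard deviation alone cannot pin down the individual y values, here is a minimal sketch. The data in `X` are simulated stand-ins for x1..x11 (an assumption, since the real data are not shown), the helper `rescale_to_target` is hypothetical, and the provided standard deviation is assumed to be the sample SD (ddof=1). Three very different y vectors are built that all match the given mean and SD exactly, yet imply different regression coefficients.

```python
import numpy as np

Y_MEAN = 0.010451  # mean of y given in the question
Y_SD = 0.004336    # SD of y given in the question (assumed to be the sample SD)


def rescale_to_target(z, mean=Y_MEAN, sd=Y_SD):
    """Shift and scale z so its sample mean and sample SD match the targets."""
    z = np.asarray(z, dtype=float)
    return mean + sd * (z - z.mean()) / z.std(ddof=1)


rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 11))  # simulated stand-in for x1..x11 (assumption)

# Three candidate y vectors with identical mean and SD
candidates = {
    "follows x1": rescale_to_target(X[:, 0]),
    "opposes x1": rescale_to_target(-X[:, 0]),
    "pure noise": rescale_to_target(rng.normal(size=n)),
}

# Ordinary least squares with an intercept, fitted to each candidate
design = np.column_stack([np.ones(n), X])
coefs = {}
for name, y in candidates.items():
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    coefs[name] = beta
    print(f"{name}: mean={y.mean():.6f} sd={y.std(ddof=1):.6f} b1={beta[1]:+.6f}")
```

Every candidate satisfies the two given constraints, but the fitted coefficient on x1 is positive in one case and negative in another, so the summary statistics cannot choose between them.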

Peter Flom
  • If my answer solves your problem, the usual practice is to accept it. (I only mention this because you are new here - and welcome to CV, by the way!). – Peter Flom Mar 04 '18 at 16:36
  • This answer is correct but for an invalid reason: It is indeed possible to do multiple regression without having the individual y values. See https://stats.stackexchange.com/questions/107597. However, since such a regression yields information about (empirical) correlations among the x and y values and such information is absent in this case, the regression is not possible. In fact, this question gives absolutely no information about how the x variables and the values of y might be related. See – whuber Mar 05 '18 at 14:20
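The point in the comment above can be made concrete with the standard least-squares formulas written in terms of summary statistics. The symbols here are notation introduced for illustration, not taken from the thread: $S_{xx}$ is the sample covariance matrix of the x variables, $s_{xy}$ is the vector of sample covariances between each x variable and y, and $s_y$ is the sample standard deviation of y.

```latex
\hat{\beta} = S_{xx}^{-1}\, s_{xy},
\qquad
\hat{a} = \bar{y} - \bar{x}^{\top}\hat{\beta},
\qquad
R^{2} = \frac{s_{xy}^{\top} S_{xx}^{-1}\, s_{xy}}{s_y^{2}}
```

The x data supply $\bar{x}$ and $S_{xx}$, and the question supplies $\bar{y}$ and $s_y$, but nothing supplies $s_{xy}$, so the coefficients cannot be computed from what is given.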