The task is to build a regression model for individuals. I have all the independent variables for each individual, but the dependent variable is only available as an aggregate at the group level.
Let's say I am trying to predict the score a student will achieve on some test. I have information about each student that can be used as predictor variables (e.g. time spent studying), but the test results are only given as aggregated sums for each class. I can link every student to a class, but I don't know the individual test results.
One potential approach I can think of would be to aggregate the independent variables too and run the regression entirely on the aggregated data. But the correlation at the aggregate level is probably rarely the same as the correlation at the individual level, so I don't know how to judge the validity of such an approach.
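To make the aggregate-only idea concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the function name `fit_on_class_sums`, the simulated study hours, and above all the premise that each student's score follows one linear model with coefficients shared across all classes. Under that premise, summing the model over a class gives class total = intercept × class size + slope × summed predictor + summed noise, so regressing class totals on class sums targets the individual-level coefficients. If the premise fails (for example, class-level effects correlated with the predictors), the estimates can be misleading.

```python
import numpy as np

def fit_on_class_sums(X, class_ids, class_totals):
    """Regress class score totals on per-class sums of the predictors.

    X            : (n_students, p) individual-level predictors
    class_ids    : (n_students,) integer class index in 0..n_classes-1
    class_totals : (n_classes,) summed test score per class
    Returns [intercept, slope_1, ..., slope_p] (hypothetical helper).
    """
    X = np.asarray(X, dtype=float).reshape(len(class_ids), -1)
    n_classes = len(class_totals)
    # Intercept column of ones: its per-class sum is the class size.
    X1 = np.column_stack([np.ones(len(X)), X])
    # Sum every column within each class.
    X_sum = np.zeros((n_classes, X1.shape[1]))
    np.add.at(X_sum, class_ids, X1)
    # Assumption: i.i.d. individual noise, so the summed noise has variance
    # proportional to class size; weight each class by 1/sqrt(size).
    w = 1.0 / np.sqrt(X_sum[:, 0])
    beta, *_ = np.linalg.lstsq(X_sum * w[:, None], class_totals * w, rcond=None)
    return beta

# Simulated example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n_classes = 400
sizes = rng.integers(15, 31, size=n_classes)          # students per class
class_ids = np.repeat(np.arange(n_classes), sizes)
hours = rng.uniform(0, 10, size=class_ids.size)       # time spent studying
scores = 40 + 3 * hours + rng.normal(0, 5, size=class_ids.size)

# Only the class totals are treated as observed.
class_totals = np.bincount(class_ids, weights=scores, minlength=n_classes)

beta = fit_on_class_sums(hours, class_ids, class_totals)
print("estimated intercept and slope:", beta)
```

In this simulation the individual-level model is linear by construction, so the fit on class sums should land near the true intercept (40) and slope (3). That says nothing about whether the same holds for real data, which is exactly the validity concern above.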
Is there any 'good' (or at least less bad) approach to this problem?