How to do hierarchical linear regression without builtin hierarchical structure

Question

I have a number of machines with different configurations. For each machine, I monitor a couple of parameters and build a linear model of the form productivity ~ monitored parameters. Now I need to predict productivity for some new machines that do not report productivity back to us. My plan is to cluster those models into groups (and probably build a groupwise model for each), then link each group to a configuration range (some configurations are numerical). That way I can assign a new machine to a group based on its configuration and apply that groupwise model to it. What is the best way to cluster those models?

Update: It might be easier to explain with another example. Say I have this data:

studentID IQ breakfast_consumption grade
1         100               50        3
1         100               100       4
2          80               100       4
2          80               80        2

I build a model for each student with grade ~ breakfast_consumption. Now we have a new student, and we have his IQ and breakfast_consumption. How can I predict his grade? By the way, I did try to build a single model for all students with grade ~ IQ + breakfast_consumption, but its R² is much lower than that of the individual models.

"What is the best way to cluster those models?" -- "I have a number of machines of different configurations"...shouldn't configurations be the cluster criteria? Or could you provide more information/context? — Jon, Dec 09 '16 at 17:58
In this case, similarity of configurations doesn't necessarily imply similarity of behaviour/models. That is why I want to do it in the opposite direction, which is clustering machines based on the similarity of their models first, and then check the configurations within each cluster — user141657, Dec 09 '16 at 18:19
I am not sure whether it makes sense to cluster based on coefficients and intercept, or there are better ways to do it. — user141657, Dec 09 '16 at 18:49
How would you cluster on intercept? I'm not sure what that means. When you cluster, a cluster (random/fixed effect) is a coefficient. There seems to be some vagueness. Could you maybe elaborate a bit more on the design of the experiment. — Jon, Dec 09 '16 at 18:54

score 1 · Answer 1 · edited Apr 13 '17 at 12:44

So, in your example there appear to be repeated measures for each student. In that case, your hierarchical structure would be something like grade ~ breakfast_consumption + (1|studentID), where I assume the average grade varies by student; this also helps to account for autocorrelation of observations within a student. You can also include IQ as a second hierarchical variable (random effect), since I assume each student has only one IQ score; otherwise, it can simply be a fixed effect.
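Written out, the random-intercept formula grade ~ breakfast_consumption + (1|studentID) corresponds to the model below. The symbols are my own notation, not from the question: i indexes observations, j indexes students, u_j is the per-student deviation from the overall intercept, and the normality of both error terms is the usual default assumption.

```latex
\text{grade}_{ij} = \beta_0 + \beta_1 \, \text{breakfast\_consumption}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```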

Now, if you want to predict the grade for a never-before-seen studentID, this may be tricky. Depending on the software you're using, you may run into problems when introducing a new cluster label (factor/random effect).
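One common way to handle an unseen label is to fall back to the fixed effects only, that is, to treat the new student's random intercept as 0 (its expected value under the model). Here is a minimal sketch of that idea; the coefficient values and the helper predict_grade are hypothetical and chosen only for illustration, not estimated from the data in the question.

```python
# Hypothetical fitted values, for illustration only (not estimated from real data).
FIXED_INTERCEPT = 1.0
FIXED_SLOPE = 0.03  # assumed effect of breakfast_consumption on grade
RANDOM_INTERCEPTS = {1: 0.4, 2: -0.4}  # assumed per-student deviations


def predict_grade(student_id, breakfast_consumption):
    """Predict a grade from a random-intercept model.

    Students seen during training get their own intercept deviation;
    unseen students get 0, i.e. the population-level prediction.
    """
    deviation = RANDOM_INTERCEPTS.get(student_id, 0.0)
    return FIXED_INTERCEPT + FIXED_SLOPE * breakfast_consumption + deviation


known_student = predict_grade(1, 100)   # uses student 1's own deviation
new_student = predict_grade(99, 100)    # student 99 was never seen
print(known_student, new_student)
```

Whether your software does this automatically or raises an error for the new label depends on the package, so check its prediction options.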

Now, the easier approach would be to scrap what I said above.

"So I can assign the new machine into a group based on its configuration and apply that groupwise model on it."

What I, as well as co-workers of mine, have done in the past is to run a clustering algorithm on the data set to create clusters. My co-workers (like most people) used k-means. I'm apprehensive about using k-means for anything other than toy examples. For its drawbacks, see "How to understand the drawbacks of K-means".

I recommend something like DBSCAN, or a clustering algorithm that uses mixture distributions; basically, model-based clustering.

Steps

1. Using appropriate variables/characteristics, cluster your students (machines) and save the labels into a new variable, say, group.
2. Fit/train your hierarchical model as grade ~ breakfast_consumption + (1|group + IQ).
3. Run a discriminant analysis for new students (machines): using your cluster results, classify the new students into the appropriate clusters.
4. Use your trained model to make the predictions you need.
5. (Optional) If there is a stream of many new observations coming in, you may need to retrain your models on the accumulated data over time.
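The steps above can be sketched roughly as follows, under some loud simplifications. The cluster labels are assumed to be already available from whatever clustering algorithm you chose; a separate least-squares line per group stands in for the hierarchical model; and nearest-centroid assignment on the configuration stands in for a proper discriminant analysis. The function names, the dict layout, and the toy numbers are all made up for illustration.

```python
from collections import defaultdict
from math import dist


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def train(units):
    """Fit one pooled line per group and compute each group's config centroid.

    units: list of dicts with keys 'group' (cluster label from step 1),
    'config' (tuple of numbers), 'x' and 'y' (lists of observations).
    """
    pooled = defaultdict(lambda: ([], []))
    configs = defaultdict(list)
    for unit in units:
        xs, ys = pooled[unit["group"]]
        xs.extend(unit["x"])
        ys.extend(unit["y"])
        configs[unit["group"]].append(unit["config"])
    models = {g: fit_line(xs, ys) for g, (xs, ys) in pooled.items()}
    centroids = {
        g: tuple(sum(col) / len(col) for col in zip(*cfgs))
        for g, cfgs in configs.items()
    }
    return models, centroids


def predict(models, centroids, config, x):
    """Assign a new unit to the nearest config centroid, then apply that group's line."""
    group = min(centroids, key=lambda g: dist(centroids[g], config))
    intercept, slope = models[group]
    return group, intercept + slope * x


# Made-up toy data: group "a" follows y = 1 + 2x, group "b" follows y = -x.
units = [
    {"group": "a", "config": (1.0,), "x": [0, 1, 2], "y": [1, 3, 5]},
    {"group": "a", "config": (2.0,), "x": [3, 4], "y": [7, 9]},
    {"group": "b", "config": (10.0,), "x": [0, 1, 2], "y": [0, -1, -2]},
    {"group": "b", "config": (12.0,), "x": [3, 4], "y": [-3, -4]},
]
models, centroids = train(units)
group, prediction = predict(models, centroids, (9.0,), 5.0)
print(group, prediction)
```

If the configuration has several numeric features on different scales, standardize them before computing distances; otherwise the largest-scale feature dominates the assignment.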

This is a rough sketch of steps I've taken in the past, but I hope it helps you out in your work.
