Consider a new variable that is a linear combination of the original variables (e.g. one of the principal components). How can we find out which original variables "produce" the new variable?
-
Could you elaborate, please? Do you mean you have a set of variables and a linear combination of some of these variables, but you don't know which ones? – Stéphane Laurent Aug 14 '14 at 08:58
-
Yes, I have one variable that is a combination of some other variables. Now we want to know which variables were used to construct the new variable. – sara Aug 14 '14 at 09:15
-
Could you edit your post to add that? And what is the link with PCA? If your linear combination comes from a PCA analysis, why do you not know what the variables are? – Stéphane Laurent Aug 14 '14 at 09:25
-
My linear combination is the product of a sufficient reduction analysis. – sara Aug 14 '14 at 09:39
-
In a PCA analysis, how do you know what the variables are? – sara Aug 14 '14 at 09:40
2 Answers
If you have a number of variables $\{x_1, x_2, \dots, x_N\}$ and you consider a new variable that is a linear combination of the original ones: $$y=\alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_N x_N = \sum_{i=1}^N \alpha_i x_i$$ then, obviously, all original variables are "used to construct" the new variable (unless some of the coefficients $\alpha_i$ are equal to zero). However, if some coefficients are $\approx 0$, then you can say that the new variable "mostly" depends only on a subset of the original variables: those whose coefficients are large in absolute value.
When you perform PCA, each principal component (e.g. the first one) is a linear combination of the original variables. So what you want to do is look at the coefficients $\alpha_i$ and see which ones have large absolute values (and which ones are close to zero).
How to actually do it depends on your software, programming language, etc.
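As one illustration, here is a minimal sketch assuming Python with NumPy (the question does not say which software is being used). The helper names `pca_coefficients` and `dominant_variables`, the simulated data, and the cutoff of 0.1 are all made up for this example; what counts as "close to zero" is a judgment call. Note also that PCA coefficients depend on the scale of the variables, so you may want to standardize the variables first if they are measured in different units.

```python
import numpy as np

def pca_coefficients(X):
    """Return the PCA coefficient vectors of X, one component per row.

    Computed from the SVD of the column-centered data; each row has unit length
    and holds the coefficients (alpha_1, ..., alpha_N) of one component.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt

def dominant_variables(alpha, threshold=0.1):
    """Indices of variables whose coefficient exceeds the (arbitrary) threshold in absolute value."""
    return np.flatnonzero(np.abs(alpha) > threshold)

# Simulated data (an assumption for illustration): variables 0-2 share a
# common latent signal, variables 3-4 are small independent noise.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
X = np.column_stack([
    latent + 0.1 * rng.normal(size=n),
    latent + 0.1 * rng.normal(size=n),
    latent + 0.1 * rng.normal(size=n),
    0.1 * rng.normal(size=n),
    0.1 * rng.normal(size=n),
])

alpha = pca_coefficients(X)[0]        # coefficients of the first component
selected = dominant_variables(alpha)  # variables that "mostly" make up PC1

print("coefficients:", np.round(alpha, 3))
print("variables with large |coefficient|:", selected)
```

The sign of a principal component is arbitrary, which is why the comparison uses absolute values.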

-
Sufficient reduction analysis maps the predictors onto a subspace that is smaller than the predictors' space, so in effect we have new predictors that are combinations of the initial predictors. What is your view on this? Can you help me find the coefficients that are equal to zero? – sara Aug 14 '14 at 09:59
-
I don't know what "sufficient reduction analysis" is; what is its relation to PCA? In any case, if your new variables are linear combinations of the original variables, then you should be able to look at the coefficients of the linear combination ("loadings" in PCA terminology). Then you can check which coefficients are zero, or close to zero. Is your problem that you cannot access these loadings? Then it becomes a question about your particular software / programming language. – amoeba Aug 14 '14 at 10:05
-
@sara: You are welcome, but I am not sure if I clarified the issue for you... If not, please edit your question to provide additional information and specify your problem better (if it refers to particular software or language, you should mention that). If yes, consider "accepting" the answer by clicking on a green tick sign nearby. And welcome to CrossValidated! – amoeba Aug 14 '14 at 10:20
-
Going back to our [conversation](http://stats.stackexchange.com/a/35653/3277), I'd like to discourage people from throwing the word "loading" around everywhere. For me, a **loading** is a coefficient in the linear combination that scores observed variables by latent variable(s), not vice versa (the vice versa are "weights"). With PCA's unaltered eigenvectors, their elements are weights=loadings. In other settings and analyses, weights and loadings usually differ, although they can easily be transformed into one another. My terminological "purism" is not about maths; it is for didactics. – ttnphns Aug 14 '14 at 11:26
-
@ttnphns: Point taken. I removed the word "loading" from my answer altogether. – amoeba Aug 14 '14 at 11:34
-
@amoeba, you don't have to "obey" me. I have simply seen many psychologists who, having done factor analysis to develop their scale, select items on the basis of factor loadings (which is right) and then score the scale total using those same loadings (instead of inverting them into weights)! – ttnphns Aug 14 '14 at 11:42
-
@ttnphns, in fact I find the word "loading" only confusing, so I would rather refrain from using it whenever possible (hence I was happy to make the edit). My background is math and machine learning, and the word "loading" is not much used there; it comes more from the psychology literature on factor analysis. – amoeba Aug 14 '14 at 11:54
After having viewed the loadings on each factor, I use PCA as a tool to propose how to construct a new variable (one that could be used elsewhere) that intuitively encompasses a shared attribute that has been outlined by a purely statistical technique. I do not recommend embracing a new factor without understanding why those variables could possibly be associated with the underlying factor. Even then, replication with new data is needed to verify the new construct.
