In this answer I will not differentiate between Factor Analysis (FA) and Principal Component Analysis (PCA); by default I mean PCA. The two are different, but in my environment, when someone says "I do factor analysis" they almost always mean PCA, rarely realizing the (subtle) difference.
Analyses on items and on scales are both correct, but they are not interchangeable, as they address slightly different problems. People in psychology usually do item-based FA, and that is probably why your tutor (maybe a little too automatically) asks you to do it. Here are the important differences:
- There is much less information in 12 scales than in 156 items - so you were able to discern many more factors (more information) from the items; the small simulation sketch after this list illustrates the gap. You could limit the factor analysis to extract only two factors and hope for comparable results, but...
- ...the factor analysis procedure incorporates a prior belief that all items are equally good. This prior produces a different bias when some scales in your questionnaire are built from very different numbers of items than the others. I know many multiple-scale questionnaires with some scales based on 2 or even 1 items and others on as many as 20. In such questionnaires the items belonging to the small scales will carry much more weight in the FA result than the items belonging to the large scales. And if some items are shared between scales, they will automatically get a greater impact on the factors.
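To make the first point concrete, here is a toy simulation in Python. All the numbers and the data-generating process are my assumptions (not your data): even when only two traits truly exist, the Kaiser rule applied to 156 noisy items suggests far more factors than the same rule applied to 12 scale scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup loosely mirroring the question (assumed numbers, not the real data):
# 2 latent traits, 12 scales, 156 items, 500 respondents.
n_resp, n_items, n_scales, n_traits = 500, 156, 12, 2
latent = rng.normal(size=(n_resp, n_traits))

scale_of_item = rng.integers(0, n_scales, size=n_items)    # which scale each item belongs to
trait_of_scale = rng.integers(0, n_traits, size=n_scales)  # which trait each scale measures
noise = rng.normal(scale=1.5, size=(n_resp, n_items))
items = latent[:, trait_of_scale[scale_of_item]] + noise   # noisy item responses

# Sum the items of each scale into a scale score.
scales = np.column_stack([items[:, scale_of_item == s].sum(axis=1) for s in range(n_scales)])

def kaiser_upper_bound(data):
    """Number of eigenvalues of the correlation matrix greater than 1 (Kaiser criterion)."""
    eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    return int((eig > 1.0).sum())

print("Kaiser upper bound on items: ", kaiser_upper_bound(items))   # far more than 2
print("Kaiser upper bound on scales:", kaiser_upper_bound(scales))  # close to 2
```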
My advice:
Do as your tutor asks - FA on items is also a good and valid procedure.
I doubt that your 43 factors are a valid result of a factor analysis - to me it sounds more like an upper bound.
A proper FA rarely ends with a number of factors equal to the upper bound given by the Kaiser criterion (keep all factors with eigenvalue greater than one). The procedure calls for you to consider all possible factor sets (honouring the Kaiser criterion and possibly varimax-rotated) and to check whether you can give a good name/meaning to each and every factor found. I usually end up with a solution containing about half as many factors as the upper bound. And analysing so many factor sets (where the first set contains 43 factors) is hard work that can hardly be automated (except maybe by a deep neural network ;-) ).
What works best for me is starting from the factor analysis with the maximum number of factors and working my way down until I either find a set of factors to which I can give a clear meaning or reach the scree criterion (the inflection point of the scree plot), which gives a lower bound on the number of factors.
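As a rough illustration of that loop, here is a minimal Python sketch, assuming your item responses are in a NumPy array called `items`: it computes the Kaiser upper bound, draws the scree plot, and then steps down through varimax-rotated solutions. The naming/interpretation step inside the loop is, of course, the part you still have to do by hand.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import FactorAnalysis

# `items` is assumed to be a (respondents x items) array of questionnaire answers.
corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

kaiser_upper_bound = int((eigenvalues > 1.0).sum())
print("Kaiser upper bound:", kaiser_upper_bound)

# Scree plot: look for the inflection point ("elbow") as a lower bound.
plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
plt.axhline(1.0, linestyle="--")          # Kaiser line
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()

# Work downward from the upper bound, inspecting varimax-rotated loadings at each step
# and stopping at the first solution whose factors you can clearly name.
for k in range(kaiser_upper_bound, 1, -1):
    fa = FactorAnalysis(n_components=k, rotation="varimax")
    fa.fit(items)
    loadings = fa.components_.T           # items x factors
    # ... inspect `loadings` here (e.g. which items load strongly on each factor) ...
```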
Timothy Brown - Confirmatory Factor Analysis for Applied Research, page 23:
> (...) Despite the fact that EFA is an exploratory or descriptive technique by nature, the decision about the appropriate number of factors should be guided by substantive considerations, in addition to the statistical guidelines discussed below. For instance, the validity of a given factor should be evaluated in part by its interpretability; for example, does a factor revealed by the EFA have substantive importance? A firm theoretical background and previous experience with the variables will strongly foster the interpretability of factors and the evaluation of the overall factor model. Moreover, factors in the solution should be well defined—that is, comprised of several indicators that strongly relate to it. (...)
If you want to test the theory that your questionnaire has exactly two factors, use Confirmatory Factor Analysis (CFA); it is a special case of path analysis, a.k.a. Structural Equation Modeling (SEM). But that is a different story.
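For completeness, a two-factor CFA can be specified with lavaan-style syntax. Below is a minimal sketch assuming the third-party `semopy` package; the item names and the file name are placeholders, so adapt them to your actual questionnaire.

```python
import pandas as pd
import semopy  # third-party SEM package (assumed installed: pip install semopy)

# Hypothetical model: two latent factors, each measured by a few items;
# the item names are placeholders for your actual column names.
model_desc = """
Factor1 =~ item1 + item2 + item3
Factor2 =~ item4 + item5 + item6
Factor1 ~~ Factor2
"""

data = pd.read_csv("questionnaire_responses.csv")  # placeholder file name

model = semopy.Model(model_desc)
model.fit(data)
print(model.inspect())            # parameter estimates (loadings, covariances)
print(semopy.calc_stats(model))   # fit indices such as CFI and RMSEA
```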