Data or Theory Driven?
The first issue is whether you want the composite to be data driven or theory driven.
If you want to form a composite variable, it is likely that you think each component variable is important in measuring some overall domain.
In this case, you will probably prefer a theoretical set of weights. If, alternatively, you are interested in whatever is shared or common among the component variables, at the risk of excluding a variable because it measures something orthogonal or less related to the remaining set, then you might want to explore data-driven approaches.
This question maps onto the discussion in the structural equation modelling literature between reflective and formative measures (e.g., see here).
Whatever you do, it is important to align your measurement with your actual research question.
Theory Driven
If the composite is theoretically driven, then you will want to form a weighted composite of the component variables, where the weight assigned to each component aligns with your theoretical weighting of that component.
If the variables are ordinal, then you'll have to think about how to scale the variable.
After scaling each component variable, you'll have to think about theoretical relative weighting and issues related to differential standard deviations of the variables.
One simple strategy is to convert all component variables into z-scores, and sum the z-scores.
If some of your component variables are positively keyed and others negatively keyed, then you'll need to reverse either just the negative or just the positive component variables, so that all components point in the same direction before they are summed.
I wrote a post on forming composites which addresses several scenarios for forming composites.
Theory-driven approaches can be implemented easily in any statistical package.
score.items in the psych package is one function that makes it a little easier, but it is limited.
You might just write your own equation using simple arithmetic, and perhaps the scale function.
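As a rough illustration of the "write your own equation" route, here is a minimal sketch in plain Python (standard library only) of a z-score composite with optional theoretical weights and reverse keying. The function names, the variable names, and the data are all hypothetical; the standardisation uses the sample standard deviation, which is what R's scale function does by default.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardise a list of numbers to mean 0 and sample SD 1."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]

def composite(data, weights=None, reverse=()):
    """Weighted sum of z-scored component variables.

    data:    dict mapping a component name to its list of scores
    weights: dict of theoretical weights (defaults to 1 for every component)
    reverse: names of components keyed in the opposite direction
    """
    weights = weights or {name: 1.0 for name in data}
    n = len(next(iter(data.values())))
    total = [0.0] * n
    for name, values in data.items():
        sign = -1.0 if name in reverse else 1.0  # flip reverse-keyed components
        for i, z in enumerate(z_scores(values)):
            total[i] += sign * weights[name] * z
    return total

# Hypothetical data: three components measured on five respondents.
data = {
    "satisfaction": [4, 5, 3, 2, 5],
    "engagement": [30, 42, 25, 20, 40],
    "burnout": [2, 1, 4, 5, 1],  # negatively keyed, so it is reversed
}
scores = composite(data, reverse=("burnout",))
```

With unit weights this is just the sum of z-scores after reversal; passing something like weights={"satisfaction": 2.0, "engagement": 1.0, "burnout": 1.0} would express a theoretical judgement that the first component matters twice as much.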
Data Driven
If you are more interested in being data driven, then there are many possible approaches.
Taking the first principal component sounds like a reasonable idea.
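To make the first-principal-component idea concrete, here is a minimal sketch in plain Python that standardises each variable, builds the correlation matrix, and finds its leading eigenvector by power iteration. The function names and data are hypothetical, and power iteration is used only to keep the example dependency-free; in practice you would use your statistical package's PCA routine (for example prcomp in R). The sketch treats the variables as numeric, so it does not address the ordinal case.

```python
from math import sqrt
from statistics import mean, stdev

def standardise(column):
    """Convert a list of numbers to z-scores using the sample SD."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s for v in column]

def first_component(columns, iterations=500):
    """Return (weights, scores) for the first principal component.

    columns: list of equal-length lists, one per component variable.
    The weights are the leading eigenvector of the correlation matrix,
    approximated by power iteration; their overall sign is arbitrary.
    """
    z = [standardise(c) for c in columns]
    n, p = len(z[0]), len(z)
    # Correlation matrix of the standardised variables.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    w = [1.0] * p  # starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]  # rescale to unit length
    # Component score for each case: weighted sum of its z-scores.
    scores = [sum(w[j] * z[j][i] for j in range(p)) for i in range(n)]
    return w, scores

# Hypothetical data: three positively related variables, five cases.
columns = [
    [4, 5, 3, 2, 5],
    [30, 42, 25, 20, 40],
    [3, 4, 2, 1, 5],
]
pc_weights, pc_scores = first_component(columns)
```

Unlike the theory-driven composite, the weights here are determined by the correlations in the data, so a variable that is weakly related to the others will receive a small weight.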
If you have ordinal variables, you might think about categorical PCA, which would allow the component variables to be reweighted. This could automatically handle the quantification given the constraints you provide.