Validation of a questionnaire in a new population

Question

I have 400 responses to a 20-item questionnaire which purports to measure an attitudinal construct in medical students. The instrument was validated in the US for a single year of medical students and the published data are very "clean": all ritc values >0.3, alpha 0.84, PCA with a stable four-factor structure, etc. In my sample I have found 5 of 20 items to have ritc <0.2, and in a cultural subpopulation (n=70) these ritc values are zero or negative. If I retain all items, those with poor ritc either do not load on any factor or sort into a 2-item factor together (factor 4). I hypothesize (and would like to investigate) that this is due to either (i) a small cultural subpopulation for which the construct may be poorly captured, or (ii) the fact that I have responses from students across all stages of a programme and there is a developmental aspect to the construct that is poorly captured by the scale items. Is there a statistical test which will allow me to investigate this?

Should items with low ritc be deleted from the scale? If so, do I do this sequentially, starting with the lowest, and at what point should I stop deleting items? How do I tell whether I have lost something from the questionnaire? If I want to compare the scale's factor structure between the major and minor subpopulations, how do I attempt this, or is the minor subsample too small to draw conclusions? Any references would be greatly appreciated.

Finally, the purpose of validating the scale is to use it to determine the effectiveness of an intervention using pre- and post-intervention scores. If an item has a low ritc, I presume it may affect the reliability of the scale in an experimental setting, or am I incorrect? Is there any statistical way to determine the utility of a scale designed to measure constructs which have a developmental aspect, i.e. do all items function appropriately as the student develops "more" of the attitudinal construct?

+1 to @Gung . I don't know either and my degree is in psychometrics. — Peter Flom, Jan 11 '14 at 22:57
Hi, sorry, that didn't come out as I had expected when typing. It is the corrected item-total correlation value for each item of the scale — suzi, Jan 12 '14 at 00:20
You have a lot going on. With enough back and forth you might get a couple of your questions answered in a satisfactory way through this site, but to make more progress than that you'll probably need the in-depth help of a consultant. — rolando2, Jan 12 '14 at 00:33
I agree with @rolando2. One quick thought though, I am not sure it is sound to rely on a factor analysis for 20 items based on 70 respondents. — David, Jan 26 '14 at 22:44
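
For readers unfamiliar with the statistic discussed in these comments, here is a minimal sketch of how a corrected item-total correlation (ritc) and Cronbach's alpha can be computed. The function names and the toy data are hypothetical and purely illustrative; they are not the asker's data. Each item is correlated with the sum of the remaining items, so the item does not inflate its own correlation.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(responses):
    """ritc per item: correlate each item with the total of all OTHER items.

    responses: list of rows, one per respondent, each a list of item scores.
    """
    n_items = len(responses[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total excluding item j
        result.append(pearson(item, rest))
    return result

def cronbach_alpha(responses):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical toy data: 5 respondents x 4 items; item 4 runs against the others.
toy = [
    [1, 1, 2, 5],
    [2, 2, 2, 4],
    [3, 3, 4, 3],
    [4, 4, 4, 2],
    [5, 5, 5, 1],
]
ritc = corrected_item_total(toy)
alpha_all = cronbach_alpha(toy)
alpha_without_item4 = cronbach_alpha([row[:3] for row in toy])

for j, r in enumerate(ritc, start=1):
    print(f"item {j}: ritc = {r:.2f}")
print(f"alpha (all items) = {alpha_all:.2f}")
print(f"alpha (item 4 dropped) = {alpha_without_item4:.2f}")
```

Running the same functions separately on each subgroup (for example the n=70 subpopulation versus the rest) shows whether the low values are concentrated in one group, although this is only descriptive and not a formal test.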

doug.numbers · Answer 1 · 2014-03-22T12:29:25.267

@suzi One of the properties upon which Rasch analysis is based is that measures are invariant across subgroups. This property supports the development of computer adaptive testing and test equating. If this invariance of measures holds true in a population, then there is no differential item functioning (DIF). To assist you with your sample, you could run a Rasch analysis for each subgroup and compare the functioning of each item across the subgroups. If an item's measures differ by more than 0.50 logits (or by more than the 95% confidence intervals of the measures), then DIF is present and the item is not invariant. As long as your subgroups have no fewer than 70 subjects, you should be okay.
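
The comparison step can be sketched as follows. This is only an illustration: it assumes you have already estimated item measures (in logits) and their standard errors separately for each subgroup with Rasch software, and that both sets of measures share the same origin. The function name `flag_dif`, the numbers, and the use of a joint standard error for the 95% check are assumptions made for this example, not output from any particular program.

```python
from math import sqrt

def flag_dif(measures_a, se_a, measures_b, se_b, logit_cutoff=0.50, z=1.96):
    """Compare per-item Rasch measures from two subgroups.

    Returns one dict per item with the difference in logits and two flags:
    - exceeds_cutoff: |difference| is larger than the logit cutoff
    - exceeds_ci: |difference| is larger than z * joint standard error
      (an assumed way of operationalising the 95% confidence check)
    """
    flags = []
    rows = zip(measures_a, se_a, measures_b, se_b)
    for i, (da, sa, db, sb) in enumerate(rows, start=1):
        diff = da - db
        joint_se = sqrt(sa ** 2 + sb ** 2)
        flags.append({
            "item": i,
            "diff": diff,
            "exceeds_cutoff": abs(diff) > logit_cutoff,
            "exceeds_ci": abs(diff) > z * joint_se,
        })
    return flags

# Hypothetical item measures (logits) and standard errors for three items.
major_measures = [-0.80, 0.10, 0.70]
major_se = [0.08, 0.08, 0.09]
minor_measures = [-0.75, 0.85, -0.10]   # smaller subgroup, so larger SEs
minor_se = [0.20, 0.21, 0.22]

dif = flag_dif(major_measures, major_se, minor_measures, minor_se)
for f in dif:
    print(f"item {f['item']}: diff = {f['diff']:+.2f} logits, "
          f"cutoff flag = {f['exceeds_cutoff']}, CI flag = {f['exceeds_ci']}")
```

With a subgroup of 70 the standard errors will be fairly wide, so it is worth looking at both flags together rather than relying on the 0.50 logit cutoff alone.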

An excellent paper on applying this principle is "Rasch Fit Statistics as a Test of the Invariance of Item Parameter Estimates", Smith, Richard M. and Suh, Kyunghee, Journal of Applied Measurement 4(2) 153-163.

As stated in the comments, this is a large field and you might need help. If a paper is possible, you might seek help through the Rasch SIG. Software would include Winsteps, Facets, RUMM, eRm, and other programs in R.

Hope this helps.

Isn't it differential item functioning? — Behacad, Mar 22 '14 at 08:26
