I want to do a regression analysis (which, as I found out from "What regression model to use when independent variables are percentages to predict % outcome?", should probably be a logistic regression), but I am not sure that it is the right one. My dependent variable is a mark that can be anywhere between 1 and 10 (so not a binary outcome like in the link), although the observed min and max are around 4 and 8. I have several independent variables, most of which are percentages. In a linear multiple regression analysis most of these variables would have been dummies. The simplest one to describe is gender (% male and % female), but I also have, for instance, a variable that consists of 5 options (A 20%, B 0%, C 30%, D 15%, E 35%), so in my dataset I have 5 different columns for this one variable. Besides these there are 'regular' variables, like population density and budget (per inhabitant). I hope I described this clearly enough.
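As an illustration of the data layout described above: the five shares of one variable always add up to 100%, so any one column is fully determined by the other four, and a regression with an intercept can only use four of them. The sketch below is a minimal example with made-up rows; the names `rows` and `drop_reference` and the choice of E as the reference category are assumptions for illustration, not part of the original dataset.

```python
# Hypothetical municipalities; the first row mirrors the A-E example in the question.
rows = [
    {"A": 20, "B": 0, "C": 30, "D": 15, "E": 35},
    {"A": 10, "B": 25, "C": 25, "D": 20, "E": 20},
]


def drop_reference(row, reference="E"):
    """Return the percentage shares without the reference category."""
    if sum(row.values()) != 100:
        raise ValueError("shares of one variable should add up to 100%")
    return {name: share for name, share in row.items() if name != reference}


predictors = [drop_reference(row) for row in rows]

# The dropped column carries no extra information: it equals 100 minus the rest.
recovered = [100 - sum(shares.values()) for shares in predictors]

print(predictors[0])
print(recovered)
```

Each remaining coefficient is then read relative to the omitted category, much like the reference level of a set of dummies.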

Could anyone help this nitwit out? I have been struggling for months now and am making no progress; the only progress is that I keep finding out that what I have been doing so far makes no sense. The program I am using is SPSS, so I hope this is possible with that program.

Thanks in advance!

Edit: Hi, first of all, thanks for the reply. The DV is an average per municipality of around 200 questionnaires in which citizens were asked to rate their satisfaction on a 1 to 5 scale, later transformed to a 1 to 10 scale. In practice all averages are between 7.1 and 8.1, so I guess it's an ordinal scale. Now concerning the relationships: I basically have 2 models. In the first one I just look at the diversity of employees (age (3 categories), ethnicity (3 categories) and gender (obviously 2 categories)), which I express as a % of the total population of local employees. The second model looks at the ratio of these percentages among the local employees to the corresponding percentages among the local citizens, which ranges from as low as 1:0.05 to around 1:2. In the first model I expect that the more diverse the organisation is, the higher the satisfaction of the citizens; in the second model I expect that municipalities close to a 1:1 ratio have a higher satisfaction than municipalities that differ more from 1:1.
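One possible way to encode the second hypothesis is to turn each ratio into a distance from parity. The sketch below uses the absolute log of the ratio, so that 1:0.5 and 1:2 count as equally far from 1:1. This is an assumed encoding for illustration, not necessarily how the variable was built in the original data, and `distance_from_parity` is a hypothetical helper name.

```python
import math


def distance_from_parity(employee_share, citizen_share):
    """Absolute log ratio: 0 at a 1:1 ratio, growing in both directions.

    Both shares must be positive; a share of 0 would need separate handling
    because the logarithm of 0 is undefined.
    """
    ratio = employee_share / citizen_share
    return abs(math.log(ratio))


# Made-up shares (in %) for one group among employees and among citizens.
print(distance_from_parity(30, 30))  # parity
print(distance_from_parity(10, 20))  # half as many as among citizens
print(distance_from_parity(20, 10))  # twice as many as among citizens
```

Under the stated expectation, satisfaction would decrease as this distance grows, so the predictor would get a negative coefficient.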

kjetil b halvorsen
TeeVeeZee
  • The kind of model is determined by the response (DV) and by the kind of relationships you expect with the IVs. Variables that would normally be dummies in a multiple regression model will generally remain dummies in other models. Are the values taken by your DV ordered categories, or are they numeric values? – Glen_b Jul 06 '15 at 10:39
  • Hi, first of all, thanks for the reply. The DV is an average per municipality of around 200 questionnaires in which citizens were asked to rate their satisfaction on a 1 to 5 scale, later transformed to a 1 to 10 scale. In practice all averages are between 7.1 and 8.1, so I guess it's an ordinal scale. – TeeVeeZee Jul 08 '15 at 09:42
  • If you averaged the individual ratings, *at that point you already assumed the scale was interval*. – Glen_b Jul 08 '15 at 11:15
  • http://www.ats.ucla.edu/stat/sas/whatstat/ so my kind of data is suitable for a multiple regression analysis after all? – TeeVeeZee Jul 10 '15 at 10:52
  • Just because the scale was assumed interval doesn't automatically mean linear regression will be a suitable model; it might be reasonable if the relationships are close to linear and the variance is near constant. (Sorry, why did you link to that document?) – Glen_b Jul 10 '15 at 11:10
  • I linked it so you could follow how I got to that conclusion (I have 1 DV and 1 or more interval IVs). – TeeVeeZee Jul 10 '15 at 11:44
  • Oh, okay thanks. I'd probably just treat the averages in the ratings as numeric, but I'd want to look at relationships between that and the IVs, and how the spread or skewness changed with mean; a quasibinomial GLM or a beta regression might be more suitable if there's a tendency for the DV to be up toward one end or the other for some regions of the IV-space. – Glen_b Jul 10 '15 at 12:25
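
To make the last suggestion concrete: both a quasibinomial GLM and a beta regression work with a response between 0 and 1, so the average mark would first be rescaled from the 1 to 10 range. The sketch below does that and then solves the logit-link estimating equations for the mean (the same equations a quasibinomial GLM uses for its coefficients) by hand with Newton steps, for a single predictor. The data, the predictor `female_share` and all function names are made up for illustration; it estimates no dispersion or standard errors, and a real analysis would use a statistics package rather than this hand-rolled fit.

```python
import math


def to_unit_interval(mark, low=1.0, high=10.0):
    """Map a mark on the 1-10 scale to a proportion between 0 and 1."""
    return (mark - low) / (high - low)


def fit_fractional_logit(x, p, iterations=25):
    """Solve sum((p - mu) * [1, x]) = 0 with mu = 1 / (1 + exp(-(b0 + b1 * x)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, pi in zip(x, p):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = mu * (1.0 - mu)  # logit-link weight
            g0 += pi - mu
            g1 += (pi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up data: share of female employees and average mark per municipality.
female_share = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
mean_mark = [7.2, 7.4, 7.3, 7.6, 7.8, 7.9]

p = [to_unit_interval(m) for m in mean_mark]
b0, b1 = fit_fractional_logit(female_share, p)


def predict_mark(x_value):
    """Fitted mean, transformed back to the 1-10 scale."""
    mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x_value)))
    return 1.0 + 9.0 * mu


print(round(b1, 3), round(predict_mark(0.5), 2))
```

Because the fitted mean passes through the logistic function, predictions can never leave the 1 to 10 range, which is the main practical difference from a plain linear regression on the averages.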
