
I have a dataset in which one of the categorical variables, used as a predictor in all of my models, has very unequal frequencies across its levels. For example, "Media Type" has the levels TV, Radio, Newspaper, and Online, but one of them (e.g. Radio) has only a small number of observations. This is not a sampling problem; it reflects reality. My main interest is in estimating the effect of this predictor on some outcome variable (e.g. whether the medium in which an ad is shown influences the propensity to buy; this is just a simple example). I get a lot of insignificant results, and I believe I should use some specific model configuration, but I don't know what to look for. I'm not even sure what this phenomenon is called (is it related to stratification?), so I don't know what to search for.

I'm planning to test multiple models: some predict a binary variable (logistic regression) and some are linear with a continuous dependent variable. Even just knowing the name of the phenomenon would help me a lot in searching for a solution!
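
To make the situation concrete, here is a minimal sketch using simulated data only. The level counts, the zero mean, and the unit standard deviation are all assumptions made up for illustration, not the real dataset. It shows one reason a rare level tends to give insignificant results: the standard error of a level's mean shrinks like 1/sqrt(n), so a level with few observations is estimated much less precisely than the others.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the simulated numbers are reproducible

# Hypothetical level counts: Radio is deliberately rare (assumed numbers).
counts = {"TV": 2000, "Newspaper": 1500, "Online": 1200, "Radio": 30}


def simulate_outcomes(n, mean=0.0, sd=1.0):
    """Draw n continuous outcomes from a normal distribution (assumed model)."""
    return [random.gauss(mean, sd) for _ in range(n)]


def standard_error(values):
    """Standard error of the sample mean: sample sd divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Same outcome distribution in every level; only the sample size differs.
se_by_level = {
    level: standard_error(simulate_outcomes(n)) for level, n in counts.items()
}

for level, se in se_by_level.items():
    print(f"{level:10s} n={counts[level]:5d}  SE of mean={se:.3f}")
```

In this sketch every level has the same underlying variability, so the wider uncertainty for Radio comes purely from its small count. The same logic applies to the coefficient of a rare level in a linear or logistic regression, although the exact standard errors there also depend on the other predictors.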

kjetil b halvorsen
user3017075

0 Answers