I recently started working with linear models and have a few questions about LMs and GLMs. My target looks like this (see below) and I have approximately 20 features.
Can somebody please confirm or answer my questions? Feel free to use the questions to test yourself.
- If I normalize my features before fitting an LM, can I say that the larger the absolute value of a regression coefficient, the more important the feature is?
- Is an LM a poor choice because the target is not normally distributed?
- If two features are highly correlated I can delete one of them?
- If I have very strongly correlated features, should I keep only one of them for fitting (multicollinearity)?
- If a correlation coefficient (e.g. Pearson or Kendall) between a feature X and the target is very close to zero, can I delete that feature?
- Which link function and distribution should I assume for my GLM? I assume Poisson does not work since it is a discrete distribution and my values are continuous; can a Gamma distribution work as long as I have no 0 in my targets?
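
To make the questions about normalization and correlated features concrete, here is a minimal sketch of what I have in mind. It uses synthetic data (not my real dataset) and assumes NumPy; the names `standardize`, `fit_ols`, `correlated_pairs` and the 0.9 threshold are placeholders I made up for illustration. It only shows the mechanics and does not settle whether the interpretation is valid.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in features; x2 is deliberately almost a copy of x1.
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Synthetic target that depends on x1 and x3 only.
y = 2.0 * x1 + 0.5 * x3 + rng.normal(scale=0.1, size=n)


def standardize(X):
    """Scale each column to zero mean and unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def correlated_pairs(X, threshold=0.9):
    """Return (i, j, r) for feature pairs whose absolute Pearson correlation exceeds threshold."""
    corr = np.corrcoef(X, rowvar=False)
    p = corr.shape[0]
    return [
        (i, j, corr[i, j])
        for i in range(p)
        for j in range(i + 1, p)
        if abs(corr[i, j]) > threshold
    ]


Z = standardize(X)
coef = fit_ols(Z, y)
pairs = correlated_pairs(Z)

print("coefficients on standardized features:", coef[1:])
print("highly correlated feature pairs:", pairs)
```

In this toy setup the pair (x1, x2) is flagged as highly correlated, and how the fit splits the effect between those two columns is exactly what I am unsure how to interpret.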