Some background to my problem: I use survey data on firms, and I want to measure the relationship between a binary variable (perceived growth barriers) and firm size. However, I cannot treat "firm size" as continuous; I need to group firms into size categories instead. I have chosen to define these categories based on their statistical relationship to the dependent variable.
My approach to fitting the categories has been to run OLS regressions in which I try out a dummy variable for every interval of consecutive firm sizes (250 regressions per round). I then fix the first size category as the interval with the highest $R^2$ and repeat the process until all sizes are categorized.
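To make the procedure concrete, here is a minimal sketch of the search as I understand it. It rests on a few assumptions: every round uses the full sample, each interval starts where the previous category ended, and ties go to the narrowest interval. The function names and the toy data are hypothetical. With a single dummy regressor, the OLS $R^2$ equals the squared correlation between the outcome and the dummy, so no regression library is needed.

```python
def r_squared_dummy(y, d):
    """R^2 from OLS of y on an intercept and one 0/1 dummy d.

    With a single regressor this is the squared correlation of y and d.
    """
    n = len(y)
    mean_y = sum(y) / n
    mean_d = sum(d) / n
    cov = sum((yi - mean_y) * (di - mean_d) for yi, di in zip(y, d))
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    var_d = sum((di - mean_d) ** 2 for di in d)
    if var_y == 0 or var_d == 0:
        return 0.0  # a constant outcome or constant dummy explains nothing
    return cov * cov / (var_y * var_d)


def best_upper_bound(sizes, y, lower, max_size):
    """Try every interval [lower, upper]; return the upper bound with the highest R^2."""
    best_upper, best_r2 = None, -1.0
    for upper in range(lower, max_size + 1):
        dummy = [1 if lower <= s <= upper else 0 for s in sizes]
        r2 = r_squared_dummy(y, dummy)
        if r2 > best_r2:  # strict comparison: ties keep the narrower interval
            best_upper, best_r2 = upper, r2
    return best_upper, best_r2


def greedy_categories(sizes, y, max_size):
    """Fix one category per round until every size up to max_size is covered."""
    categories = []
    lower = min(sizes)
    while lower <= max_size:
        upper, r2 = best_upper_bound(sizes, y, lower, max_size)
        categories.append((lower, upper, r2))
        lower = upper + 1
    return categories


# Hypothetical toy data: employees per firm and a 0/1 "perceives barrier" flag.
toy_sizes = [1, 1, 2, 2, 3, 3, 10, 10, 20, 20]
toy_barrier = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
toy_categories = greedy_categories(toy_sizes, toy_barrier, max_size=20)
for lower, upper, r2 in toy_categories:
    print(f"{lower}-{upper} employees: R^2 = {r2:.3f}")
```

On the toy data the second round already lumps every firm with 4 to 20 employees into one category, which is the "overly wide" behaviour I am trying to avoid.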
However, my data exhibit high variance among larger firms, which means that I cannot use $R^2$ alone, as it would only end up creating "overly wide" categories. Therefore, I have also weighted each $R^2$ by the estimated kernel density of firm size at the point where the category ends (e.g., a category containing firms with 4-16 employees would be weighted by the kernel density at the size "16 employees"). This was done to "slow down" the search algorithm and to force it to include influential groups that are relevant to my research.
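The weighting step can be sketched as follows. This is an illustration under stated assumptions, not my exact code: a Gaussian kernel, the normal reference rule of thumb for the bandwidth ($1.06\,\hat\sigma\,n^{-1/5}$), and hypothetical function names and toy data.

```python
import math
import statistics


def rule_of_thumb_bandwidth(data):
    """Normal reference bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)


def kde_at(x, data, bandwidth):
    """Gaussian kernel density estimate of `data`, evaluated at the point x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)


def weighted_score(r2, upper, sizes, bandwidth=None):
    """R^2 of a candidate category times the size density at its upper bound."""
    if bandwidth is None:
        bandwidth = rule_of_thumb_bandwidth(sizes)
    return r2 * kde_at(upper, sizes, bandwidth)


# Hypothetical toy data: employees per firm, with few large firms.
toy_sizes = [1, 1, 2, 2, 3, 3, 10, 10, 20, 20]

# Two candidate categories with the same R^2 but different end points.
score_ending_at_3 = weighted_score(1.0, 3, toy_sizes)
score_ending_at_20 = weighted_score(1.0, 20, toy_sizes)
print(score_ending_at_3 > score_ending_at_20)
```

Because few firms are large, the density at a high upper bound is small, so a category that stretches into the sparse right tail is penalized relative to one that ends where firms are concentrated.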
However, this last solution was ad hoc and not grounded in previous research (I found none on creating categories).
My question is now:
Are there any alternative model fit measures to $R^2$ that are perhaps less sensitive to heteroscedasticity in the data (i.e., ideally a measure that would not require the use of kernel density weights to solve the above problem)?
Alternatively, do you have any suggestions for improvements or alternative approaches to solving this issue?