
I am wondering why we do not use least squares instead of maximum likelihood.

For example, suppose we have 3 choices, $i = 1, 2, 3$, and we estimate the coefficients by minimizing

$$\left(\frac{e^{\beta_{i} X}}{1+\sum_{j} e^{\beta_{j} X}} - Y\right)^{2}$$

for $i = 1, 2, 3$.

kjetil b halvorsen
sherek_66

1 Answer


The short answer is that least squares is not maximum likelihood estimation for this model, so it is not optimal. Maximum likelihood solves for the $\beta$ that makes the observed data most likely to have been observed. The likelihood function for Bernoulli random variables ($Y=0,1$) is $\prod_{i} p_{i}^{Y_{i}}(1-p_{i})^{1-Y_{i}}$, where $p_{i}$ is the modeled probability that $Y_{i}=1$. It involves exponents in $Y$, not squares.
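To make the contrast concrete, here is a minimal sketch in Python that fits the same binary logistic model twice, once by maximizing the Bernoulli likelihood and once by minimizing squared error. Everything in it is an assumption made for illustration: the simulated data, the "true" coefficients, the two-class (not three-class) setup, and the helper names `fit_ml` and `fit_ls`. It is not a reference implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(beta, X, y):
    # Bernoulli negative log-likelihood: -sum[y log p + (1 - y) log(1 - p)]
    p = np.clip(sigmoid(X @ beta), 1e-12, 1 - 1e-12)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def squared_error(beta, X, y):
    # Least-squares criterion from the question: sum (p - y)^2
    return np.sum((sigmoid(X @ beta) - y) ** 2)

def fit_ml(X, y, n_iter=25):
    # Newton-Raphson on the log-likelihood (hypothetical helper)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_ls(X, y, lr=1.0, n_iter=5000):
    # Plain gradient descent on the mean squared error (hypothetical helper)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = 2.0 * X.T @ ((p - y) * p * (1 - p)) / len(y)
        beta = beta - lr * grad
    return beta

# Simulated data; the coefficients below are arbitrary assumptions.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.5])
y = (rng.random(n) < sigmoid(X @ true_beta)).astype(float)

beta_ml = fit_ml(X, y)
beta_ls = fit_ls(X, y)

print("ML estimate:", beta_ml, "NLL:", neg_log_likelihood(beta_ml, X, y))
print("LS estimate:", beta_ls, "NLL:", neg_log_likelihood(beta_ls, X, y))
```

When the model is correctly specified, both criteria target the same $\beta$, so the two estimates will usually be close in a large sample. The difference is efficiency: the maximum likelihood estimate attains the highest likelihood on the observed data by construction, and asymptotically it has the smaller variance.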

Frank Harrell