I am using income data from the Current Population Survey for a small undergrad economics paper.
In economics, there is evidence that the incomes of the bottom 97%–99% of the population are approximately log-normally distributed, while the incomes of the highest earners follow a Pareto distribution.
I have used kernel density estimation to plot the lower 99%, and the estimated density does appear to be log-normal. I would now like to estimate the parameters mu and sigma (the mean and standard deviation of log income). How do I go about this?
I have been reading about maximum likelihood estimation, but I'm not sure how to compute the estimates when I have 200,000 rows of data. Do I have to write my own algorithm to sum over all of my x's, or is there a built-in function I could use?
I would ideally like to do this in R or Stata.
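For concreteness, here is a minimal sketch of the calculation as I understand it, written in Python purely as an illustration. As far as I can tell, the log-normal MLE has a closed form: mu is the mean of log(x), and sigma is the square root of the mean squared deviation of log(x) from that mean (dividing by n, not n - 1). The function name `lognormal_mle` and the simulated sample are my own placeholders, not CPS data. The sketch is unweighted and treats the lower 99% as a complete log-normal sample, so it ignores the truncation at the top.

```python
import math
import random


def lognormal_mle(incomes):
    """Closed-form maximum likelihood estimates (mu, sigma) for log-normal data.

    Hypothetical helper for illustration only. Non-positive values are
    dropped because log(x) is undefined for them.
    """
    logs = [math.log(x) for x in incomes if x > 0]
    n = len(logs)
    # MLE of mu: sample mean of the log incomes.
    mu_hat = math.fsum(logs) / n
    # MLE of sigma: root mean squared deviation, dividing by n (not n - 1).
    sigma_hat = math.sqrt(math.fsum((v - mu_hat) ** 2 for v in logs) / n)
    return mu_hat, sigma_hat


# Simulated stand-in for the income column (assumed values, not real data).
random.seed(0)
sample = [random.lognormvariate(10.5, 0.8) for _ in range(200_000)]

mu_hat, sigma_hat = lognormal_mle(sample)
print(f"mu_hat = {mu_hat:.4f}, sigma_hat = {sigma_hat:.4f}")
```

If this is right, 200,000 rows should not be a problem, since the whole estimate is a single pass over the logged values. What I am unsure about is whether this is the correct approach for my truncated sample, and what the equivalent built-in would be.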