I am posting this in the hope that it will also be useful to others. See also "Fitting t-distribution in R: scaling parameter".

My data series x is fat-tailed, with 1063 observations. It seems straightforward to determine whether I can reject a Cauchy:

> library(fitdistrplus)
> ca <- fitdist(x, "cauchy")

> summary(ca)
Fitting of the distribution ' cauchy ' by maximum likelihood
Parameters :
         estimate Std. Error
location  0.01140   0.001277
scale     0.02612   0.001031
Loglikelihood:  1588   AIC:  -3172   BIC:  -3162

Goodness-of-fit statistics
                             1-mle-cauchy
Kolmogorov-Smirnov statistic      0.07703
Cramer-von Mises statistic        1.21856
Anderson-Darling statistic       12.35285

Goodness-of-fit criteria
                               1-mle-cauchy
Aikake's Information Criterion        -3172

I would like to know whether the data can reject the null hypothesis that x was drawn from a Cauchy. It would be nice if fitdist gave an interpretation of the statistics itself, but I think I can work it out. I just need to compare, e.g., my K-S statistic of 0.077 to the K-S critical value of 1.63/sqrt(length(x)) = 0.05 (the asymptotic 1% level). Since 0.077 > 0.05, we can reject the null that the data are from a Cauchy.
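The comparison above can be sketched in a few lines of base R. This is only a rough benchmark under stated assumptions: the variable names are made up for illustration, the numbers are copied from the output above, and the asymptotic critical value is derived for a fully specified null, whereas here location and scale were estimated from the same data.

```r
# Numbers copied from the post; variable names are illustrative only
n       <- 1063             # number of observations
ks_stat <- 0.07703          # K-S statistic from the Cauchy fit
ks_crit <- 1.63 / sqrt(n)   # asymptotic 1% critical value, about 0.05

# TRUE means the statistic exceeds the benchmark critical value
reject_cauchy <- ks_stat > ks_crit
print(c(ks_stat = ks_stat, ks_crit = ks_crit))
print(reject_cauchy)
```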

Now, the Cauchy and the normal are both special cases of the Student-t (df = 1 and the limit of infinite df, respectively). I would like to see where in the family of Student-t distributions x falls, and how much freeing this parameter helps compared with the Cauchy. Alas, fitdistr first needs a df provided as a parameter. I can match the kurtosis of x to determine the best match. A Student-t has an excess kurtosis of 6/(df-4) for df > 4, so for my x with an excess kurtosis of 10.7, my df should be about 4.5. And now,

 t4 <- fitdistr(x, "t", df=4.5)

which for me gives a t4$loglik of about 1699.
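The kurtosis matching used to pick df = 4.5 can be written out as a small sketch. The helper `excess_kurtosis` is a hypothetical name introduced here (a plain moment estimator, not a function from fitdistrplus or MASS), and the value 10.7 is taken from the post and assumed to be excess kurtosis.

```r
# Hypothetical helper: sample excess kurtosis via the moment estimator
excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3
}

# Invert excess kurtosis = 6 / (df - 4), which is valid only for df > 4
df_from_kurtosis <- function(k) 4 + 6 / k

k_post   <- 10.7                      # excess kurtosis reported in the post
df_match <- df_from_kurtosis(k_post)  # roughly 4.56
print(df_match)
```

With the actual series, `df_from_kurtosis(excess_kurtosis(x))` would give the matched df. Note that the inversion can never return a df at or below 4, so it cannot point toward the Cauchy end of the family.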

The coup de grace should be to test how well this freed-up df performs vs. the restricted Cauchy with something like

 library(lmtest)
 lrtest( ca, t4 )

i.e., something like 2*log-LR compared to the chi-squared. With a log-likelihood difference of 111 (1699 vs. 1588), 2*log-LR is about 222, and we are way above the 10.8 critical level at 0.1% for a dimensionality difference of 1. Is this built into R's fitdistr somehow, and I just overlooked it? Is this the right approach?
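A minimal base-R sketch of the likelihood-ratio comparison described above, under stated assumptions. The first part only redoes the arithmetic with the two log-likelihoods quoted in the post. The second part uses simulated data (`x_sim`, a stand-in, not the original series) and hand-written likelihood functions (`nll_t`, `nll_cauchy`, hypothetical names) to estimate df by maximum likelihood, since a fit with df fixed at 4.5 has the same number of free parameters as the Cauchy fit and the one-degree-of-freedom chi-squared reference applies when df is actually estimated.

```r
# Part 1: arithmetic with the log-likelihoods quoted in the post
ll_cauchy <- 1588
ll_t      <- 1699
lr_post   <- 2 * (ll_t - ll_cauchy)   # 2 * 111
crit_001  <- qchisq(0.999, df = 1)    # 0.1% critical value, about 10.8

# Part 2: free-df fit on simulated data (stand-in for x)
set.seed(1)
x_sim <- 0.01 + 0.02 * rt(1063, df = 4.5)

# Negative log-likelihood of a location-scale Student-t.
# par = (location, log scale, log df); logs keep scale and df positive.
nll_t <- function(par, x) {
  m <- par[1]; s <- exp(par[2]); df <- exp(par[3])
  -sum(dt((x - m) / s, df = df, log = TRUE) - log(s))
}

# Restricted model: Cauchy, i.e. the same density with df fixed at 1
nll_cauchy <- function(par, x) nll_t(c(par, log(1)), x)

start <- c(median(x_sim), log(IQR(x_sim) / 2))
fit_c <- optim(start, nll_cauchy, x = x_sim)
fit_t <- optim(c(start, log(5)), nll_t, x = x_sim)

# optim minimises, so $value holds the negative log-likelihood
lr_stat <- 2 * (fit_c$value - fit_t$value)
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
df_hat  <- exp(fit_t$par[3])
print(c(lr_stat = lr_stat, p_value = p_value, df_hat = df_hat))
```

Writing the likelihood by hand keeps the example free of package dependencies. It is not meant as a replacement for fitdist or fitdistr, and it relies on the same iid assumption as the rest of the post.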

kjetil b halvorsen
ivo Welch
  • when you refer to your data as a 'series' ... are your values observed over time? – Glen_b Apr 12 '15 at 22:38
  • indeed in my application. and the iid assumption is heroic. but, this question is not just my specific application. it is generic. – ivo Welch Apr 13 '15 at 02:32
  • can you share (a subsample) of the data here? – Georg M. Goerg Jul 08 '15 at 10:47
  • http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html – ivo Welch Jul 08 '15 at 15:24
  • @ivoWelch thanks a lot for the link, but which of the many files is your particular dataset? – Georg M. Goerg Jan 18 '16 at 04:42
  • Fama-French, 3 factors, daily at http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily_CSV.zip is a great starter for daily returns on some big portfolios. – ivo Welch Jan 18 '16 at 16:35
  • @ivoWelch thanks for the pointer. in your OP you have 1063 observations; the dataset you pointed to has 23,000 observations. Any particular subset I should use? Or is your question general, not specific to your particular 1063 observations dataset? And also: are you really interested in fitting/testing a Cauchy specifically, or more generally test for existence of first k moments / estimating tail parameters? – Georg M. Goerg Feb 15 '16 at 23:38
  • there are daily and monthly data. I probably used monthly. you probably used daily. it was general. – ivo Welch Feb 16 '16 at 03:13
  • @ivoWelch can you edit the original post to clarify what exact question you want answered? I am not sure if you are asking about fitting t distributions in general, testing "H0: X ~ Cauchy", a nice interface for likelihood ratio tests in R, ... – Georg M. Goerg Feb 17 '16 at 02:41

0 Answers