
I am looking for a C++ statistics library to experiment with outlier detection in time series (among other things).

What I need:

  • Robust estimators, correlations, hypothesis tests, etc;
  • No dependencies on external libraries;
  • No GPL;

Would be a plus:

  • Lightweight;
  • Free;
  • Portable;
  • Active and supported;
Korchkidu
  • Does your third point mean "Must not be GPL" or does it mean "Need not be GPL"? – psj Jun 01 '12 at 09:54
  • We will eventually use it in proprietary software. GPL is not compatible with that, right? – Korchkidu Jun 01 '12 at 10:04
  • I'm not an expert on licensing; it seems to depend on what you mean by "proprietary" and "compatible". If you want to sell a closed-source application using GPL code, then yes, that could be a problem (AFAIK). – psj Jun 01 '12 at 10:24
  • Yes, this is what I understood too. But I believe that with GPL, you can be asked to disclose all your software using GPL code even if it is internal... – Korchkidu Jun 01 '12 at 10:31
  • You might also want to detect seasonal pulses, as they are not outliers but systematic. Also, a level shift (step shift, intercept change) is a series of contiguous "outliers" which have the same size and sign; these should not be confused with "single outliers" either. – IrishStat Jun 01 '12 at 12:55
  • Do you guys know whether importing R in C++ works? – gui11aume Jun 01 '12 at 15:22
  • @gui11aume: you mean something like this? http://dirk.eddelbuettel.com/code/rcpp.html – psj Jun 01 '12 at 16:06

1 Answer

For the statistical part of your question, look in the /src folder of the .tar.gz file here. The manual (a PDF file at the same link) points to a selection of papers. The package collects real-time versions (i.e. with an amortized cost of O(1) per update) of all the state-of-the-art univariate outlier detection procedures. I'm not involved in it, but I can't recommend it enough.

For the licensing part of your problem, you may have a look here.

For most of these algorithms, the code in that package is the only C++ implementation I know of. Without knowing what the terms of the R licences are, I suppose you could still use the code there to test the procedures and then re-implement your preferred ones under your own terms.
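
To make "robust univariate outlier detection" concrete, here is a minimal sketch of a moving-window median/MAD rule (often called a Hampel identifier). It is not taken from the package above; the names `median_of`, `flag_outliers`, `window` and `threshold` are illustrative assumptions, and for simplicity it recomputes each window from scratch instead of updating it in amortized O(1).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a copy of v (v must be non-empty).
double median_of(std::vector<double> v) {
    const std::size_t n = v.size();
    const std::size_t mid = n / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double hi = v[mid];
    if (n % 2 == 1) return hi;
    // Even length: average the two middle order statistics.
    const double lo = *std::max_element(v.begin(), v.begin() + mid);
    return 0.5 * (lo + hi);
}

// Flags x[i] when it lies more than `threshold` robust standard deviations
// from the median of the trailing window x[i-window+1 .. i].
// The first window-1 points are never flagged (not enough history).
std::vector<bool> flag_outliers(const std::vector<double>& x,
                                std::size_t window, double threshold) {
    std::vector<bool> flags(x.size(), false);
    if (window == 0) return flags;
    for (std::size_t i = window - 1; i < x.size(); ++i) {
        std::vector<double> w(x.begin() + (i + 1 - window), x.begin() + (i + 1));
        const double med = median_of(w);
        for (double& value : w) value = std::fabs(value - med);
        // 1.4826 rescales the MAD to estimate the standard deviation
        // when the clean data are Gaussian.
        const double scale = 1.4826 * median_of(w);
        if (scale > 0.0 && std::fabs(x[i] - med) > threshold * scale) {
            flags[i] = true;
        }
    }
    return flags;
}
```

Calling `flag_outliers(series, 5, 3.0)` on a mostly flat series with one spike should flag only the spike; a level shift, by contrast, stops being flagged once it fills more than half of the window.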

user603
  • Thanks. I got the C++ source files but I have no idea how to use them...;( Do you have any documentation or even a simple main function calling all the filters one by one? – Korchkidu Jun 01 '12 at 16:32
  • The main functions are in robust.cpp. The SEXP interface is explained in numerous places, for example here: http://www.stat.harvard.edu/ccr2005/R/dot-Call.pdf. It is called at line 122 of the file /R/robfilter.R. All SEXP types correspond to generic C types, so by reading through robfilter.R you should find the generic C type of each of the inputs. With this information, you can easily re-wrap "SEXP robustRegression" into a proper main() function. Does that help? (I assumed basic familiarity with C.) – user603 Jun 01 '12 at 19:14
  • Oh, OK. So the cpp files are just an interface to the R scripts, right? And I will have to embed the R engine in some way? Sorry, I know C and C++. I do not know R at all. – Korchkidu Jun 03 '12 at 04:47
  • I don't see why R could not be bypassed; it seems to be used only in the robust.cpp file, for some R-specific data types. The R scripts are in /R/; the .cpp files contain the C++ code. – user603 Jun 04 '12 at 23:31
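
The re-wrapping suggested in the comments can be sketched roughly as follows. This is an assumption-laden illustration, not the package's actual code: `running_median`, `running_median_sexp` and `filter_series` are hypothetical names, and the real signature of `robustRegression` is not reproduced here. The idea is only that the numerical core takes plain C types, while the SEXP layer (shown as a comment, since it needs R's headers) just unpacks arguments.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical numerical core: plain C types only, no R headers needed.
// Writes the median of each trailing window of `width` points into out[i];
// the first width-1 entries simply copy the input. Assumes width >= 1.
void running_median(const double* y, int n, int width, double* out) {
    for (int i = 0; i < n; ++i) {
        if (i + 1 < width) { out[i] = y[i]; continue; }
        std::vector<double> w(y + (i + 1 - width), y + (i + 1));
        std::nth_element(w.begin(), w.begin() + width / 2, w.end());
        out[i] = w[width / 2];  // exact median for odd widths
    }
}

// An R-facing wrapper would only convert SEXP arguments and call the core.
// Kept as a comment because it requires R.h and Rinternals.h:
//
//   extern "C" SEXP running_median_sexp(SEXP y, SEXP width) {
//       int n = Rf_length(y);
//       SEXP out = PROTECT(Rf_allocVector(REALSXP, n));
//       running_median(REAL(y), n, INTEGER(width)[0], REAL(out));
//       UNPROTECT(1);
//       return out;
//   }

// R-free entry point: what a standalone program would call instead.
std::vector<double> filter_series(const std::vector<double>& y, int width) {
    std::vector<double> out(y.size());
    running_median(y.data(), static_cast<int>(y.size()), width, out.data());
    return out;
}
```

A standalone `main()` would then read the series into a `std::vector<double>`, call `filter_series`, and print the result, with no R engine involved.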