(b) What sort of Statistics use-cases would you advise someone to use Julia in?
(c) If R is slow at a certain task does it make sense to switch to
Julia or Python?
High dimensional and compute intensive problems.
Multiprocessing. Julia's single-node parallel capabilities (@spawnat) are much more convenient than those in Python. E.g. in Python you cannot use a map-reduce multiprocessing pool on the REPL, and every function you wish to parallelise requires lots of boilerplate.
Cluster computing. Julia's ClusterManagers package lets you use a compute cluster almost as you would a single machine with several cores. [I've been playing with making this feel more like scripting in ClusterUtils.]
Shared Memory. Julia's SharedArray objects are superior to the equivalent shared-memory objects in Python.
Speed. My Julia implementation is faster (on a single machine) than my R implementation at random number generation and at linear algebra (Julia supports multithreaded BLAS).
Interoperability. Julia's PyCall module gives you access to the Python ecosystem without wrappers, e.g. I use this for pylab. There's something similar for R, but I've not tried it. There is also ccall for C/Fortran libraries.
GPU. Julia's CUDA wrappers are far more developed than those in Python (R's were nearly non-existent when I checked). I suspect this will continue to be the case because it is much easier to call external libraries in Julia than in Python.
Ecosystem. The Pkg module uses GitHub as a backend. I believe this will have a big impact on the long-run maintainability of Julia modules, as it makes it much more straightforward to offer patches or for owners to pass on responsibility.
$\sigma$ is a valid variable name ;)
Writing fast code for large problems will increasingly depend on parallel computing. Python is inherently parallel-unfriendly (GIL), and native multiprocessing in R is nonexistent AFAIK. Julia doesn't require you to drop down to C to write performant code, and it retains much of the feel of Python/R/Matlab.
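To make the Python side of the multiprocessing point concrete, here is a minimal sketch of a map-reduce over a process pool using the standard library. The names square and map_reduce are hypothetical, chosen only for illustration. This is my reading of the "boilerplate" complaint, not necessarily the exact setup meant above: the worker function has to live at module top level so that worker processes can import it by name, and the pool has to be created under a main guard.

```python
from functools import reduce
from multiprocessing import Pool
import operator


def square(x):
    # Hypothetical worker function. It is defined at module top level
    # because the pool pickles functions by name, so workers must be
    # able to import it.
    return x * x


def map_reduce(data, mapper=map):
    # `mapper` is any map-like callable: the built-in `map` (serial)
    # or a bound `Pool.map` (parallel). The reduce step sums the results.
    return reduce(operator.add, mapper(square, data), 0)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block when
    # they import the main module (needed under the "spawn" start method).
    with Pool(processes=2) as pool:
        total = map_reduce(range(10), mapper=pool.map)
    print(total)  # sum of squares of 0..9, i.e. 285
```

Passing the mapper in as an argument keeps the reduce logic identical for the serial and parallel cases, so the function can be checked with the built-in map before a pool is involved.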
The main downside to Julia, coming from Python/R, is the lack of documentation outside of the core functionality. Python is very mature, and what you can't find in the docs is usually on Stack Overflow. R's documentation system is pretty good in comparison.
(a) Would you advise any new users of statistical tools to learn Julia
over R?
Yes, if you fit the use cases in part (b). If your use case involves lots of heterogeneous work