If I create machine learning models in Python or R, is it possible to export them in a format that Spark MLlib can import?
Viewed 4,572 times
1 Answer
6
If evaluation of your model in Spark is sufficient, you could look into PMML as an exchange format. Both Python and R can generate it for some models, for example: https://support.zementis.com/entries/37092748-Introducing-Py2PMML
Spark can evaluate PMML using this library: https://github.com/jpmml/jpmml-spark
Going the other way, you can also export PMML from Spark MLlib.

Nascif
- In addition to this, the OP can use Jupyter, which supports running R inline in a cell with the "%R" magic. R also has its own SparkR: https://spark.apache.org/docs/latest/sparkr.html – Alex R. Apr 06 '16 at 18:16
- You can export Scikit-Learn models to PMML using the sklearn2pmml package: https://github.com/jpmml/sklearn2pmml It supports 50+ different Scikit-Learn estimator and transformer types, and can guarantee that the exported model reproduces the Scikit-Learn predictions. Being part of the JPMML family, it also works well with JPMML-Spark and related products. – user1808924 Jun 21 '16 at 08:05
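
To make the sklearn2pmml suggestion above concrete, here is a minimal sketch of what the export step might look like. It assumes scikit-learn and sklearn2pmml are installed and that a Java runtime is on the PATH (sklearn2pmml calls a Java converter). The function name `export_iris_tree_to_pmml`, the Iris dataset, the decision tree, and the output file name are illustrative choices, not anything from the thread.

```python
def export_iris_tree_to_pmml(pmml_path="DecisionTreeIris.pmml"):
    """Fit a small decision tree on the Iris data and write it out as PMML.

    Hypothetical helper for illustration; the model and dataset are assumptions.
    """
    # Imports live inside the function so that defining it does not require
    # scikit-learn, sklearn2pmml, or a Java runtime to be available.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier
    from sklearn2pmml import sklearn2pmml
    from sklearn2pmml.pipeline import PMMLPipeline

    # Load a small example dataset (assumption: any tabular data would do).
    X, y = load_iris(return_X_y=True)

    # sklearn2pmml expects the estimator to be wrapped in a PMMLPipeline.
    pipeline = PMMLPipeline([
        ("classifier", DecisionTreeClassifier(random_state=0)),
    ])
    pipeline.fit(X, y)

    # Convert the fitted pipeline to a PMML file at the given path.
    sklearn2pmml(pipeline, pmml_path)
    return pmml_path
```

Calling `export_iris_tree_to_pmml()` should leave a `.pmml` file on disk, which could then be handed to a PMML evaluator on the Spark side, such as the JPMML-Spark library linked in the answer.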