Just wondering - if my organisation's data never grows to sizes that are bigger than my instance's memory, why do I need something like Spark?
I can scale the memory up using cloud instances; these days it seems you can really push the max memory on cloud instances. https://aws.amazon.com/ec2/instance-types/
So am I missing something here? What does a Spark-based machine learning solution offer over a single high-memory instance? Parallel processing?
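For concreteness, the single-node alternative I have in mind is roughly the sketch below. The file name, columns, and instance type are just placeholders, but this is the kind of workflow I'd run on one big box:

```python
# Everything in RAM on a single high-memory instance: load with pandas,
# train with scikit-learn. (Toy file name and "label" column for illustration.)
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_parquet("events.parquet")  # assume this fits in memory
X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# n_jobs=-1 already parallelises across all cores of the one machine,
# so I get within-box parallelism without any cluster.
clf = RandomForestClassifier(n_estimators=200, n_jobs=-1)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```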
Thanks.