
Just wondering: if my organisation's data never grows larger than my instances' memory, why do I need something like Spark?

I can scale the memory up using cloud instances, and these days it seems you can push the maximum memory of a single cloud instance quite far: https://aws.amazon.com/ec2/instance-types/

So am I missing something here? What does a Spark-based machine learning solution offer over a single high-memory instance? Is it just parallel processing?

Thanks.

lppier
