Want to save dollars, resources, and time processing big data? Switch to Apache Spark
March 27, 2019
Score 9 out of 10
Vetted Review
Verified User
Overall Satisfaction with Apache Spark
We sold a data science product to a leading US-based e-commerce firm. Their data then began growing very quickly. At that stage the product was written in R, and at this data volume it started taking a long time to run. We began looking for an alternative to R that could handle rapidly growing data like this client's, and eventually came across Apache Spark. With the client's permission, we started porting the code from R to Spark. Learning Spark and rewriting the code took a very long time, but it was worth the effort: R jobs that had been taking days to run came down to a few hours.
- Very good tool to process big datasets.
- Inbuilt fault tolerance.
- Supports multiple languages.
- Supports advanced analytics.
- A large number of libraries available -- GraphX, Spark SQL, Spark Streaming, etc.
- Very slow with smaller amounts of data.
- Expensive, as it keeps data in memory and so needs machines with plenty of RAM.
- We saved a lot of time and resources, thereby saving a lot of dollars for our company as well as the client.