A powerhouse processing engine.
September 19, 2020

Score 9 out of 10
Vetted Review
Verified User
Overall Satisfaction with Apache Spark
We use Apache Spark for cluster computing in large-scale data processing, ETL, machine learning, and analytics. It's primarily used by the Data Engineering department to support the data lake infrastructure. It helps us manage the large volumes of data that come from our clusters, providing the capacity, scalability, and performance we need.
Pros
- Speed: Apache Spark has great performance for both streaming and batch data
- Easy to use: the object-oriented operators make it intuitive.
- Multiple language support
- Fault tolerance
- Cluster management
- Supports DataFrames (DF), Datasets (DS), and RDDs
Cons
- Hard to learn; the documentation could be more in-depth.
- Because of its in-memory processing, it can consume a large amount of memory.
- Poor data visualization, too basic.
- Saved time and resources for the company because of its agility.
- High-performance data processing.
Do you think Apache Spark delivers good value for the price?
Yes
Are you happy with Apache Spark's feature set?
Yes
Did Apache Spark live up to sales and marketing promises?
Yes
Did implementation of Apache Spark go as expected?
Yes
Would you buy Apache Spark again?
Yes