Apache Spark: Lightning-Fast Distributed Computing with a Learning Curve
- Fault-tolerant: node failures are rare, and when a node does fail, processing still continues.
- Highly scalable: capacity grows by adding nodes to the cluster.
- Has a built-in machine learning library, MLlib.
- Very flexible: it can read data from a variety of sources, and using it with HDFS is especially easy.
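
To make the fault-tolerance point a bit more concrete: Spark keeps track of how each partition of a dataset was derived (its lineage), so a partition lost with a failed node can be recomputed from the source data. The toy below is plain Python, not Spark code; `ToyPartition`, `lose`, and `collect` are hypothetical names made up for this illustration, and it is only a rough sketch of the idea.

```python
from typing import Callable, List, Optional


class ToyPartition:
    """One slice of a dataset plus the recipe (lineage) used to build it.

    Hypothetical illustration only; this is not part of any Spark API.
    """

    def __init__(self, source: List[int], transform: Callable[[int], int]) -> None:
        self.source = source        # durable input, e.g. a block stored on HDFS
        self.transform = transform  # lineage: how to derive the result from the input
        self.cached: Optional[List[int]] = None  # in-memory result, lost if the node dies

    def compute(self) -> List[int]:
        # Recompute from the durable source whenever the in-memory result is missing.
        if self.cached is None:
            self.cached = [self.transform(x) for x in self.source]
        return self.cached

    def lose(self) -> None:
        """Simulate a node failure that wipes the in-memory result."""
        self.cached = None


def collect(partitions: List[ToyPartition]) -> List[int]:
    """Gather the results of all partitions, recomputing any that were lost."""
    out: List[int] = []
    for p in partitions:
        out.extend(p.compute())
    return out


partitions = [
    ToyPartition([1, 2, 3], lambda x: x * x),
    ToyPartition([4, 5, 6], lambda x: x * x),
]

before = collect(partitions)
partitions[1].lose()          # one "node" fails
after = collect(partitions)   # the lost partition is rebuilt from its lineage
print(before == after)
```

The job's result is the same before and after the simulated failure, because only the lost slice has to be rebuilt rather than the whole dataset.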