Sparkling Spark
June 26, 2017

Sunil Dhage | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User

Overall Satisfaction with Apache Spark

Apache Spark is replacing our traditional ETL tool, and we are also using it for data science solutions.
  • It makes the ETL process very simple compared to the SQL Server and MySQL ETL tools.
  • It's very fast and has many machine learning algorithms which can be used for data science problems.
  • It is easily implemented on a cloud cluster.
  • The initialization and SparkContext procedures.
  • Running applications on a cluster is not well documented anywhere, and some applications are hard to debug.
  • Debugging and testing are sometimes time-consuming.
  • Time saved in developing applications is less.
  • ROI on time, resources, and money.
  • Can replace the traditional database systems.
It's well suited for ETL, data integration, and data science problems involving large data sets. It's not at all suitable for small data sets, which can be handled on desktops and laptops using Python tools.

Evaluating Apache Spark and Competitors

Yes - Microsoft Server
  • Price
  • Product Features
  • Product Usability
  • Product Reputation
  • Prior Experience with the Product
  • Vendor Reputation
  • Existing Relationship with the Vendor
  • Analyst Reports
It works on all kinds of data, unlike SQL Server, which needs structured data.
I would think of ROI and resource allocation as the most important factors.