What users are saying about

1010data

5 Ratings
Score 4.5 out of 10

Apache Spark

96 Ratings
Score 8.6 out of 10

Likelihood to Recommend

1010data

The software is excellent for any application which is too large for Excel. The visual interface surpasses that of most SQL platforms. It is quite useful for data mining in an exploratory way but less useful in statistical and regression analysis.

Apache Spark

Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to perform data transformations on big data as offline jobs, and use a MongoDB-like distributed database system for more real-time queries.

Pros

  • Crunches huge datasets
  • Has versatility in functionality and grouping capabilities
  • Native XML macro language is easy to master
  • The platform has great visualization tools
  • Ease of use: the Spark API needs minimal boilerplate, and applications can be written in a variety of languages including Python, Scala, and Java.
  • Performance: for most applications we have found that jobs run faster via Spark than via other distributed processing technologies like MapReduce, Hive, and Pig.
  • Flexibility: the framework comes with support for streaming, batch processing, SQL queries, machine learning, etc. It can be used in a variety of applications without needing to integrate a lot of other distributed processing technologies.

Cons

  • The ten.do interface could use more detailed documentation
  • Resource heavy: jobs can be very memory intensive, so the nodes in your cluster need to be sized to match.
  • Debugging: it has gotten better with every release, but errors can still be difficult to debug because of ambiguous or misleading exceptions and stack traces.
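
On the memory point, Spark exposes sizing properties that are typically set in `spark-defaults.conf` or passed at submit time. The fragment below is only a sketch: the property names are standard Spark settings, but every value is an illustrative assumption and would need tuning for a real cluster and workload.

```
# spark-defaults.conf (illustrative values only; tune to your workload)
spark.driver.memory            4g
spark.executor.memory          8g
spark.executor.memoryOverhead  1g
spark.executor.cores           4
```

Each executor then needs roughly the executor memory plus the overhead, so node memory should comfortably exceed that sum multiplied by the number of executors per node.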

Usability

1010data

8.0
Based on 1 answer

Apache Spark

No score
No answers yet

Alternatives Considered

While we have used SQL, 1010data is really the only industry-standard product available for our use.
Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.

Return on Investment

  • This has sped up the process of analysis.
  • We can now automate processes that were performed manually.
  • Analysis can be performed more frequently.
  • Apache Spark has faster performance compared to MapReduce.
  • The combination of Python and Spark works best: shorter code and fast, efficient performance.
  • Can replace RDBMS

Pricing Details

1010data

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No
Additional Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level setup fee? No
Additional Pricing Details