Apache Spark vs. Elasticsearch

Overview
ProductRatingMost Used ByProduct SummaryStarting Price
Apache Spark
Score 8.6 out of 10
N/A
N/AN/A
Elasticsearch
Score 8.4 out of 10
N/A
Elasticsearch is an enterprise search tool from Elastic in Mountain View, California.
$16
per month
Pricing
Apache SparkElasticsearch
Editions & Modules
No answers on this topic
Standard
$16.00
per month
Gold
$19.00
per month
Platinum
$22.00
per month
Enterprise
Contact Sales
Offerings
Pricing Offerings
Apache SparkElasticsearch
Free Trial
NoNo
Free/Freemium Version
NoNo
Premium Consulting/Integration Services
NoNo
Entry-level Setup FeeNo setup feeNo setup fee
Additional Details
More Pricing Information
Community Pulse
Apache SparkElasticsearch
Considered Both Products
Apache Spark

No answer on this topic

Elasticsearch
Chose Elasticsearch
They all have their specific pros and cons. Elastic was actually initially brought in to provide less expensive functionality to Splunk, and Splunk use cases. Grafana was brought in to provide less expensive visualizations compared to Splunk and Elastic...I would recommend …
Top Pros
Top Cons
Best Alternatives
Apache SparkElasticsearch
Small Businesses

No answers on this topic

Algolia
Algolia
Score 8.9 out of 10
Medium-sized Companies
Cloudera Manager
Cloudera Manager
Score 9.7 out of 10
Guru
Guru
Score 9.0 out of 10
Enterprises
IBM Analytics Engine
IBM Analytics Engine
Score 8.8 out of 10
Guru
Guru
Score 9.0 out of 10
All AlternativesView all alternativesView all alternatives
User Ratings
Apache SparkElasticsearch
Likelihood to Recommend
9.9
(24 ratings)
9.0
(47 ratings)
Likelihood to Renew
10.0
(1 ratings)
10.0
(1 ratings)
Usability
10.0
(3 ratings)
10.0
(1 ratings)
Support Rating
8.7
(4 ratings)
7.8
(9 ratings)
Implementation Rating
-
(0 ratings)
9.0
(1 ratings)
User Testimonials
Apache SparkElasticsearch
Likelihood to Recommend
Apache
Well suited: To most of the local run of datasets and non-prod systems - scalability is not a problem at all. Including data from multiple types of data sources is an added advantage. MLlib is a decently nice built-in library that can be used for most of the ML tasks. Less appropriate: We had to work on a RecSys where the music dataset that we used was around 300+Gb in size. We faced memory-based issues. Few times we also got memory errors. Also the MLlib library does not have support for advanced analytics and deep-learning frameworks support. Understanding the internals of the working of Apache Spark for beginners is highly not possible.
Read full review
Elastic
Elasticsearch is a really scalable solution that can fit a lot of needs, but the bigger and/or those needs become, the more understanding & infrastructure you will need for your instance to be running correctly. Elasticsearch is not problem-free - you can get yourself in a lot of trouble if you are not following good practices and/or if are not managing the cluster correctly. Licensing is a big decision point here as Elasticsearch is a middleware component - be sure to read the licensing agreement of the version you want to try before you commit to it. Same goes for long-term support - be sure to keep yourself in the know for this aspect you may end up stuck with an unpatched version for years.
Read full review
Pros
Apache
  • Apache Spark makes processing very large data sets possible. It handles these data sets in a fairly quick manner.
  • Apache Spark does a fairly good job implementing machine learning models for larger data sets.
  • Apache Spark seems to be a rapidly advancing software, with the new features making the software ever more straight-forward to use.
Read full review
Elastic
  • As I mentioned before, Elasticsearch's flexible data model is unparalleled. You can nest fields as deeply as you want, have as many fields as you want, but whatever you want in those fields (as long as it stays the same type), and all of it will be searchable and you don't need to even declare a schema beforehand!
  • Elastic, the company behind Elasticsearch, is super strong financially and they have a great team of devs and product managers working on Elasticsearch. When I first started using ES 3 years ago, I was 90% impressed and knew it would be a good fit. 3 years later, I am 200% impressed and blown away by how far it has come and gotten even better. If there are features that are missing or you don't think it's fast enough right now, I bet it'll be suitable next year because the team behind it is so dang fast!
  • Elasticsearch is really, really stable. It takes a lot to bring down a cluster. It's self-balancing algorithms, leader-election system, self-healing properties are state of the art. We've never seen network failures or hard-drive corruption or CPU bugs bring down an ES cluster.
Read full review
Cons
Apache
  • Memory management. Very weak on that.
  • PySpark not as robust as scala with spark.
  • spark master HA is needed. Not as HA as it should be.
  • Locality should not be a necessity, but does help improvement. But would prefer no locality
Read full review
Elastic
  • Joining data requires duplicate de-normalized documents that make parent child relationships. It is hard and requires a lot of synchronizations
  • Tracking errors in the data in the logs can be hard, and sometimes recurring errors blow up the error logs
  • Schema changes require complete reindexing of an index
Read full review
Likelihood to Renew
Apache
Capacity of computing data in cluster and fast speed.
Read full review
Elastic
We're pretty heavily invested in ElasticSearch at this point, and there aren't any obvious negatives that would make us reconsider this decision.
Read full review
Usability
Apache
The only thing I dislike about spark's usability is the learning curve, there are many actions and transformations, however, its wide-range of uses for ETL processing, facility to integrate and it's multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
Read full review
Elastic
To get started with Elasticsearch, you don't have to get very involved in configuring what really is an incredibly complex system under the hood. You simply install the package, run the service, and you're immediately able to begin using it. You don't need to learn any sort of query language to add data to Elasticsearch or perform some basic searching. If you're used to any sort of RESTful API, getting started with Elasticsearch is a breeze. If you've never interacted with a RESTful API directly, the journey may be a little more bumpy. Overall, though, it's incredibly simple to use for what it's doing under the covers.
Read full review
Support Rating
Apache
1. It integrates very well with scala or python. 2. It's very easy to understand SQL interoperability. 3. Apache is way faster than the other competitive technologies. 4. The support from the Apache community is very huge for Spark. 5. Execution times are faster as compared to others. 6. There are a large number of forums available for Apache Spark. 7. The code availability for Apache Spark is simpler and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Read full review
Elastic
We've only used it as an opensource tooling. We did not purchase any additional support to roll out the elasticsearch software. When rolling out the application on our platform we've used the documentation which was available online. During our test phases we did not experience any bugs or issues so we did not rely on support at all.
Read full review
Implementation Rating
Apache
No answers on this topic
Elastic
Do not mix data and master roles. Dedicate at least 3 nodes just for Master
Read full review
Alternatives Considered
Apache
All the above systems work quite well on big data transformations whereas Spark really shines with its bigger API support and its ability to read from and write to multiple data sources. Using Spark one can easily switch between declarative versus imperative versus functional type programming easily based on the situation. Also it doesn't need special data ingestion or indexing pre-processing like Presto. Combining it with Jupyter Notebooks (https://github.com/jupyter-incubator/sparkmagic), one can develop the Spark code in an interactive manner in Scala or Python
Read full review
Elastic
As far as we are concerned, Elasticsearch is the gold standard and we have barely evaluated any alternatives. You could consider it an alternative to a relational or NoSQL database, so in cases where those suffice, you don't need Elasticsearch. But if you want powerful text-based search capabilities across large data sets, Elasticsearch is the way to go.
Read full review
Return on Investment
Apache
  • Faster turn around on feature development, we have seen a noticeable improvement in our agile development since using Spark.
  • Easy adoption, having multiple departments use the same underlying technology even if the use cases are very different allows for more commonality amongst applications which definitely makes the operations team happy.
  • Performance, we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs.
Read full review
Elastic
  • We have had great luck with implementing Elasticsearch for our search and analytics use cases.
  • While the operational burden is not minimal, operating a cluster of servers, using a custom query language, writing Elasticsearch-specific bulk insert code, the performance and the relative operational ease of Elasticsearch are unparalleled.
  • We've easily saved hundreds of thousands of dollars implementing Elasticsearch vs. RDBMS vs. other no-SQL solutions for our specific set of problems.
Read full review
ScreenShots