Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
N/A
Apache Solr
Score 8.7 out of 10
N/A
Apache Solr is an open-source enterprise search server.
N/A
Elasticsearch
Score 8.5 out of 10
N/A
Elasticsearch is an enterprise search tool from Elastic in Mountain View, California.
Between Solr and ElasticSearch, there is a constant struggle to pick the best one. ElasticSearch is part of ELK and ties in well with LogStash and Kibana which makes it great for logs and big data stuff. Add some logs and see which works best for your particular access methods …
We tried to use both Elasticsearch and Swiftype with Drupal 8 but there are currently no good modules that integrate Drupal with those solutions. So Solr was really the only option for a Drupal 8 web site. It's not as easy to learn or use as Swiftype, but in the end I think it …
We tryed to promote Redis as cache solution for application, in order to replace Apache Solr, but it won't go well. Redis best pratices requires some more computer resources. With Elastic Search, the use case was another, and don't compete with Apache Solr.
Apache Solr in general stacks up very well to its competitors, it provides much of the same features and performance and has the benefits of being an open-source project with an active contributor base that works consistently and improves the platform. Depending on your setup …
Apache Solr is the closest competitor to ElasticSearch from a search engine perspective. ElasticSearch is simple and streamlined in it's configuration. When taken as a whole, Apache Solr is more robust as a storage engine from a developer perspective, ElasticSearch has the …
Elasticsearch is relatedly cheaper the splunk. OpenSearch is good and we migrated some data into it but the critical data stays in elasticsearch as it has formal support.
They all have their specific pros and cons. Elastic was actually initially brought in to provide less expensive functionality to Splunk, and Splunk use cases. Grafana was brought in to provide less expensive visualizations compared to Splunk and Elastic...I would recommend …
Elasticsearch is the most well-known and supported free data platform that we identified. We are taking advantage of community knowledge and practices. In terms of flexibility and breadth of use cases no other competitor came close to Elasticsearch. We've tried Solr in the past …
Elasticsearch and Solr are both based on Lucene, but the user community for Elasticsearch is much stronger, and setting up a cluster is easier. Splunk is very well suited for Log indexing and searching but is not nearly as flexible as Elasticsearch. Couchbase is a great NoSQL …
Elasticsearch is very well packed in a broad set of features, ranging from customization capabilities to security and add-ons, and also comes with a great visualization tool named Kibana. Most of the competitors are strong in some of these areas, but I know of no other that's …
Almost no one uses Solr anymore--most have migrated to Elasticsearch. I've never tried it myself but I heard Solr is much more difficult to configure and because it doesn't use a REST API, it locks you into Java and XML. XML--ick! Lucene: Elasticsearch is built using Lucene …
All database systems have things they are good at, and things they aren't as good at. Riak/SOLR is great as a K/V store, but SOLR cannot handle requests as fast as ElasticSearch. In fact, SOLR is the reason we had to migrate to ElasticSearch. Redis is great at SET operations …
When we first evaluated Elasticsearch, we compared it with alternatives like traditional RDBMS products (Postgres, MySQL) as well as other noSQL solutions like Cassandra & MongoDB. For our use case, Elasticsearch delivered on two fronts. First, we got a world-class search …
NEST library is excellent, excellent performance, and scalability (we used a cluster of 2 nodes, and most the queries completed in ms, some may take up to 2s.
Elasticsearch is widely popular and it's mostly free. Its ecosystem, ability to scale, ease to set up, integration with other systems, highly usable API make it really great compared to its competition.
Elasticsearch is DevOps friendly; it is easy for installation and management of a node/cluster. It is very friendly for developers by providing the REST API out of the box, reducing the development time.
Elasticsearch is based off of Apache Lucene. You get the same power as well as a JSON response. REST API is simple and easy to understand. Other options include XML responses which is much more complicated to parse at times.
For our application, ElasticSearch fulfilled all the criteria we were looking for. Something that's easy to scale and flexible. I think ElasticSearch works better that Solr with modern real-time search applications. Also, ElasticSearch is easy to integrate with. ElasticSearch …
Solr is the only other alternative product I've used. Elasticsearch in comparison is a much better product. The query language in elasticsearch along with the cluster management and sharding makes Elasticsearch a clear winner.
Ability to support JSON queries, Percolator, ease to set up and custom routing were some of the reasons why we decided to use Elasticsearch instead of Solr.
Apache Kafka is well-suited for most data-streaming use cases. Amazon Kinesis and Azure EventHubs, unless you have a specific use case where using those cloud PaAS for your data lakes, once set up well, Apache Kafka will take care of everything else in the background. Azure EventHubs, is good for cross-cloud use cases, and Amazon Kinesis - I have no real-world experience. But I believe it is the same.
Solr spins up nicely and works effectively for small enterprise environments providing helpful mechanisms for fuzzy searches and facetted searching. For larger enterprises with complex business solutions you'll find the need to hire an expert Solr engineer to optimize the powerful platform to your needs. Internationalization is tricky with Solr and many hosting solutions may limit you to a latin character set.
Elasticsearch is a really scalable solution that can fit a lot of needs, but the bigger and/or those needs become, the more understanding & infrastructure you will need for your instance to be running correctly. Elasticsearch is not problem-free - you can get yourself in a lot of trouble if you are not following good practices and/or if are not managing the cluster correctly. Licensing is a big decision point here as Elasticsearch is a middleware component - be sure to read the licensing agreement of the version you want to try before you commit to it. Same goes for long-term support - be sure to keep yourself in the know for this aspect you may end up stuck with an unpatched version for years.
Really easy to configure. I've used other message brokers such as RabbitMQ and compared to them, Kafka's configurations are very easy to understand and tweak.
Very scalable: easily configured to run on multiple nodes allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered)
Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many).
Easy to get started with Apache Solr. Whether it is tackling a setup issue or trying to learn some of the more advanced features, there are plenty of resources to help you out and get you going.
Performance. Apache Solr allows for a lot of custom tuning (if needed) and provides great out of the box performance for searching on large data sets.
Maintenance. After setting up Solr in a production environment there are plenty of tools provided to help you maintain and update your application. Apache Solr comes with great fault tolerance built in and has proven to be very reliable.
As I mentioned before, Elasticsearch's flexible data model is unparalleled. You can nest fields as deeply as you want, have as many fields as you want, but whatever you want in those fields (as long as it stays the same type), and all of it will be searchable and you don't need to even declare a schema beforehand!
Elastic, the company behind Elasticsearch, is super strong financially and they have a great team of devs and product managers working on Elasticsearch. When I first started using ES 3 years ago, I was 90% impressed and knew it would be a good fit. 3 years later, I am 200% impressed and blown away by how far it has come and gotten even better. If there are features that are missing or you don't think it's fast enough right now, I bet it'll be suitable next year because the team behind it is so dang fast!
Elasticsearch is really, really stable. It takes a lot to bring down a cluster. It's self-balancing algorithms, leader-election system, self-healing properties are state of the art. We've never seen network failures or hard-drive corruption or CPU bugs bring down an ES cluster.
Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great.
Simplify the process for local deployment of Kafka and provide a user interface to get visibility into the different topics and the messages being processed.
Learning curve around creation of broker and topics could be simplified
These examples are due to the way we use Apache Solr. I think we have had the same problems with other NoSQL databases (but perhaps not the same solution). High data volumes of data and a lot of users were the causes.
We have lot of classifications and lot of data for each classification. This gave us several problems:
First: We couldn't keep all our data in Solr. Then we have all data in our MySQL DB and searching data in Solr. So we need to be sure to update and match the 2 databases in the same time.
Second: We needed several load balanced Solr databases.
Third: We needed to update all the databases and keep old data status.
If I don't speak about problems due to our lack of experience, the main Solr problem came from frequency of updates vs validation of several database. We encountered several locks due to this (our ops team didn't want to use real clustering, so all DB weren't updated). Problem messages were not always clear and we several days to understand the problems.
Apache Kafka is highly recommended to develop loosely coupled, real-time processing applications. Also, Apache Kafka provides property based configuration. Producer, Consumer and broker contain their own separate property file
It takes some time to deploy and currectly maintein it. And also, to learn how to use and integrate in the enviroment as well. Once you get theses steps done, it usability is very simple, and almost of the time it don't require no further attention on it. Even for maintence, if you deploy it on a cluster mode, it is very reliable and easy to take one host down.
To get started with Elasticsearch, you don't have to get very involved in configuring what really is an incredibly complex system under the hood. You simply install the package, run the service, and you're immediately able to begin using it. You don't need to learn any sort of query language to add data to Elasticsearch or perform some basic searching. If you're used to any sort of RESTful API, getting started with Elasticsearch is a breeze. If you've never interacted with a RESTful API directly, the journey may be a little more bumpy. Overall, though, it's incredibly simple to use for what it's doing under the covers.
Support for Apache Kafka (if willing to pay) is available from Confluent that includes the same time that created Kafka at Linkedin so they know this software in and out. Moreover, Apache Kafka is well known and best practices documents and deployment scenarios are easily available for download. For example, from eBay, Linkedin, Uber, and NYTimes.
We've only used it as an opensource tooling. We did not purchase any additional support to roll out the elasticsearch software. When rolling out the application on our platform we've used the documentation which was available online. During our test phases we did not experience any bugs or issues so we did not rely on support at all.
I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer in the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer way fewer functionalities and respond to other needs.
We tried to use both Elasticsearch and Swiftype with Drupal 8 but there are currently no good modules that integrate Drupal with those solutions. So Solr was really the only option for a Drupal 8 web site. It's not as easy to learn or use as Swiftype, but in the end I think it will be a little less expensive and offer more customization and flexibility.
As far as we are concerned, Elasticsearch is the gold standard and we have barely evaluated any alternatives. You could consider it an alternative to a relational or NoSQL database, so in cases where those suffice, you don't need Elasticsearch. But if you want powerful text-based search capabilities across large data sets, Elasticsearch is the way to go.
Positive: Get a quick and reliable pub/sub model implemented - data across components flows easily.
Positive: it's scalable so we can develop small and scale for real-world scenarios
Negative: it's easy to get into a confusing situation if you are not experienced yet or something strange has happened (rare, but it does). Troubleshooting such situations can take time and effort.
We have had great luck with implementing Elasticsearch for our search and analytics use cases.
While the operational burden is not minimal, operating a cluster of servers, using a custom query language, writing Elasticsearch-specific bulk insert code, the performance and the relative operational ease of Elasticsearch are unparalleled.
We've easily saved hundreds of thousands of dollars implementing Elasticsearch vs. RDBMS vs. other no-SQL solutions for our specific set of problems.