Item: Elasticsearch
Rating: 7
Author: Verified User

Use Cases and Deployment Scope

Elasticsearch (ES) is being used to measure the performance metrics of our web crawlers for our web metrics department. The department runs a series of crawlers that feed data to an ELK stack, which we use to measure and monitor organic messages related to our marketing campaigns. It primarily allows us to bring advanced analytics in-house.
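
As a rough sketch of what one of these data feeds could look like, the snippet below builds the newline-delimited JSON body that Elasticsearch's _bulk endpoint accepts. The index name (crawler-metrics) and the metric fields are made up for illustration, and this is not necessarily how our own feeds are wired; in a full ELK stack, Logstash would normally do this step for you.

```python
import json


def to_bulk_payload(index_name, docs):
    """Serialize docs as the NDJSON body expected by the _bulk endpoint.

    Each document is preceded by an action line telling ES to index it.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical crawler metrics; field names are assumptions.
crawler_metrics = [
    {"crawler": "crawler-01", "pages_fetched": 1200, "avg_response_ms": 340},
    {"crawler": "crawler-02", "pages_fetched": 950, "avg_response_ms": 410},
]

payload = to_bulk_payload("crawler-metrics", crawler_metrics)
print(payload)
```

The resulting string would be sent as the body of a POST to the _bulk endpoint with the Content-Type set to application/x-ndjson.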

Pros and Cons

Free of SQL: ES does not have the overhead of relying on SQL. In fact, you can use it alongside most (if not all) DBMSs out there.
Java: Normally, this is not a strength, since Java is slow and cumbersome. In this case, I believe it's truly a feature: building on a language with universal support makes ES VERY DevOps friendly, because you can focus on problem-oriented rather than solutions-based thinking.
Although ES has been known to consume RAM, it's very flexible, and I have implemented it on a number of distinct hardware configurations with success.
Linux: It's not locked down to an OS (which is the way of the future), and as a result, running it on Linux means you get the power of Linux in a data science package.

Elasticsearch IS a resource hog: most of the time, I will run ES on a dedicated VM (often a dedicated blade, too!) and allow the other components of the stack to run on separate blades/VMs.
Works great for small projects, but is NOT industrial strength: When you are performing a data architecture project where you are capturing and mining datasets, ES is fine until you get into much denser data sources (on the order of TBs), at which point ES will violate data integrity.
It only supports JSON output: This is very friendly to a lot of DevOps/Data Architecture projects, but may become a hassle when your endpoints require CSV, XML, etc.
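
On the JSON-only point, the conversion itself is not hard, just an extra step to maintain. A minimal sketch, assuming a search response with the usual hits.hits[]._source layout; the sample documents and field names are invented for illustration:

```python
import csv
import io
import json


def hits_to_csv(response_json, fields):
    """Flatten the _source of each hit in a search response into CSV text."""
    response = json.loads(response_json)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # skip any _source keys not listed in fields
        lineterminator="\n",
    )
    writer.writeheader()
    for hit in response["hits"]["hits"]:
        writer.writerow(hit["_source"])
    return buffer.getvalue()


# Hypothetical search response, trimmed to the parts used here.
sample = json.dumps({
    "hits": {
        "hits": [
            {"_source": {"crawler": "crawler-01", "pages_fetched": 1200}},
            {"_source": {"crawler": "crawler-02", "pages_fetched": 950}},
        ]
    }
})

csv_text = hits_to_csv(sample, ["crawler", "pages_fetched"])
print(csv_text)
```

This only handles flat documents; nested fields would need to be flattened first, which is where the hassle tends to start.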

Return on Investment

(Negative) Expense: Just time. Early on, I had issues getting it installed on an exotic distribution, so the cost was labor hours invested.
(Positive) Configuration and Modularity: You don't *have* to implement a full ELK stack. In fact, you could just run one, two or three components. However, by marrying ES with something like Syslog-ng, you combine two software packages that are powerful and feature-rich in their own right into an amazingly powerful data collection and gathering tool.
(Positive) Shallow learning curve: if you can write your own Unix configuration files, you will be able to maintain and develop on Elasticsearch.

Alternatives Considered

Logstash, Redis, Jenkins, Ansible, Puppet Enterprise (formerly Puppet Data Center Automation), Chef and Loggly

ES does not compete with the above packages but complements them. By automating and mining logs, you are able to get a sense of the business process, marketing data or whatever else you need to capture and mine. The potential energy stored within Elasticsearch makes it a great tool to include in your DevOps toolbox.

Other Software Used

Logstash, Loggly, Jenkins, Ansible

Likelihood to Recommend

Elasticsearch is great for development/research projects: It's fast, and *fairly* simple to set up. It suits project ideas on the scale of watching a marketing feed from Twitter or scraping sites. For high availability in (say) a SCADA environment, it is probably not helpful. I would, however, recommend it for logging system nodes, for example in a data center, a trouble ticketing dashboard, or health/status visualizations.

Elasticsearch: A Great Lab / Development Platform for Data Architects and DevOps

Overall Satisfaction with Elasticsearch