Useful Self-Hosted ETL tool for Event Driven Applications
March 19, 2018

Jordan Moore | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User

Overall Satisfaction with Logstash

My primary use case for Logstash is ingesting log files into a local Elasticsearch and Kibana Docker container so that I can search through the logs more easily. My favorite feature is the grok parser, because it makes it easy to decompose complex regular expressions into simplified, reusable patterns. Logstash has a plethora of available plugins, but the out-of-the-box connections have addressed all my needs thus far.
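To illustrate what decomposing a regular expression into named patterns means, here is a minimal Python sketch of the grok idea. It is not Logstash's implementation: the tiny `PATTERNS` library, the helper names `grok_to_regex` and `parse`, and the sample log line are all assumptions made up for this example, and real grok ships a much larger pattern set.

```python
import re

# Hypothetical mini pattern library; these regexes are simplified stand-ins,
# not the definitions that ship with Logstash.
PATTERNS = {
    "INT": r"\d+",
    "WORD": r"\w+",
    "IPV4": r"(?:\d{1,3}\.){3}\d{1,3}",
    "URIPATH": r"/\S*",
}

# Matches grok-style references such as %{IPV4:client} or %{INT}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def grok_to_regex(expression):
    """Expand %{NAME:field} references into named regex groups."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        body = PATTERNS[name]
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return re.compile(GROK_REF.sub(expand, expression))


def parse(expression, line):
    """Return the captured fields as a dict, or None if the line does not match."""
    m = grok_to_regex(expression).search(line)
    return m.groupdict() if m else None


LOG_PATTERN = "%{IPV4:client} %{WORD:method} %{URIPATH:path} %{INT:status}"
event = parse(LOG_PATTERN, "203.0.113.7 GET /index.html 200")
print(event)
```

The point of the sketch is that the log format is described once in terms of short, readable names, and the long regular expression is assembled behind the scenes.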
  • Plugin ecosystem allows modular extensions.
  • Tight integration with Elastic's Beats and Elasticsearch products, so minimal setup is required when using those tools.
  • Filter plugins are powerful for extracting and enriching input data.
  • Since it's a Java product, JVM tuning must be done to handle high load.
  • The persistent queue feature is nice, but I feel like most companies would want to use Kafka as a general storage location for persistent messages for all consumers to use. A pipeline of "Kafka input -> filter plugins -> Kafka output" seems like a good solution for data enrichment without needing to maintain a custom Kafka consumer that does the same job.
  • I would like to see more documentation on creating a distributed Logstash cluster, because I imagine that would be necessary for high-ingestion use cases.
  • Logstash has allowed me to ingest log files of various patterns into Elasticsearch for analysis using its flexible Grok parser.
  • I've been able to perform web analytics over datasets using Logstash's GeoIP and reverse DNS lookups.
  • By providing a simple mechanism for adding plugins, Logstash has allowed me to install extensions on top of those already pre-installed.
Logstash can be compared to other ETL frameworks or tools, but it is also complementary to several, for example, Kafka. I would suggest using Logstash not only when the rest of the ELK stack is available, but also as a self-hosted event collection pipeline for various search systems such as Solr or Graylog, or even for monitoring solutions built on top of Graphite or OpenTSDB.
Logstash is well suited for tight integration into the ELK stack, but it is also flexible enough to support other ingestion workloads similar to any other message bus or queueing framework. Compared to a message queue, though, Logstash also supports various filter and enrichment plugins that allow you to manipulate data as it passes through the system.
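As a rough illustration of the difference between a plain message queue and an "input -> filter plugins -> output" pipeline, here is a small Python sketch. It uses in-memory collections as stand-ins for Kafka topics, and the filter functions, field names, and sample events are hypothetical. It only mimics the shape of such a pipeline and says nothing about how Logstash is implemented.

```python
import json
from collections import deque


def drop_health_checks(event):
    """Filter step: returning None drops the event from the pipeline."""
    return None if event.get("path") == "/health" else event


def add_status_class(event):
    """Enrichment step: derive a coarse class such as '2xx' from the status code."""
    event["status_class"] = f"{int(event['status']) // 100}xx"
    return event


def run_pipeline(source, filters, sink):
    """Drain source, pass each event through filters in order, append survivors to sink."""
    while source:
        event = json.loads(source.popleft())
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:
                break  # event was dropped by a filter
        else:
            sink.append(json.dumps(event, sort_keys=True))


# In-memory stand-ins for an input topic and an output topic (assumption).
raw_topic = deque([
    '{"path": "/index.html", "status": "200"}',
    '{"path": "/health", "status": "200"}',
    '{"path": "/missing", "status": "404"}',
])
enriched_topic = []

run_pipeline(raw_topic, [drop_health_checks, add_status_class], enriched_topic)
for message in enriched_topic:
    print(message)
```

A bare queue would hand the three messages to consumers unchanged; here the health check is dropped and the remaining events arrive already enriched, which is the part a custom consumer would otherwise have to implement.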