TrustRadius: an HG Insights company

Apache Flume

Score 7.1 out of 10

9 Reviews and Ratings

What is Apache Flume?

Apache Flume is a product enabling the flow of logs and other data into a Hadoop environment.


Apache Flume, the way your information flows

Pros

  • Multiple data origins (sources) and destinations (sinks) allow you to move data from and to any relevant data store
  • It is very easy to set up and run
  • Very open to customization: you can create filters, enrichment steps, and new sources and destinations
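
The source, sink, and filter/enrichment ideas in the list above can be pictured with a minimal sketch. This is a conceptual Python model, not Flume's actual Java API or configuration format; the names `Event`, `run_pipeline`, `drop_debug`, and `add_host`, as well as the host value `web-01`, are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Event:
    """A unit of data: a body plus free-form headers."""
    body: str
    headers: dict = field(default_factory=dict)


# An interceptor returns the (possibly modified) event, or None to drop it.
Interceptor = Callable[[Event], Optional[Event]]


def drop_debug(event: Event) -> Optional[Event]:
    """Filter: discard lines that start with DEBUG."""
    return None if event.body.startswith("DEBUG") else event


def add_host(event: Event) -> Optional[Event]:
    """Enrichment: tag the event with a (hypothetical) host name."""
    event.headers["host"] = "web-01"
    return event


def run_pipeline(source: Iterable[str],
                 interceptors: List[Interceptor],
                 sink: Callable[[Event], None]) -> None:
    channel = deque()  # buffer between the source side and the sink side

    # Source side: read lines, apply interceptors, queue the survivors.
    for line in source:
        event = Event(body=line)
        for intercept in interceptors:
            event = intercept(event)
            if event is None:
                break
        if event is not None:
            channel.append(event)

    # Sink side: drain the channel into the destination.
    while channel:
        sink(channel.popleft())


delivered: List[Event] = []
run_pipeline(
    source=["INFO started", "DEBUG noisy detail", "ERROR disk full"],
    interceptors=[drop_debug, add_host],
    sink=delivered.append,
)
print([e.body for e in delivered])
```

Swapping the list passed as `source` or the callable passed as `sink` changes where data comes from and goes to without touching the rest of the pipeline, which is the flexibility this review describes.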

Cons

  • Apache Flume adds new functionality at a slower pace than other open source projects; it is well behind Kafka and has some compatibility issues with the latest releases
  • It lacks high availability (HA) and fault tolerance (FT) of its own; it relies on third-party management software such as Hortonworks or Cloudera

Return on Investment

  • Flume has greatly simplified many of our ingest procedures; it is easier to deploy and integrate than a classical EAI, which reduces time to market
  • Unlike an EAI, however, Apache Flume may not be as suitable if the project starts to grow in complexity

Alternatives Considered

Logstash

Other Software Used

Apache Kafka, Logstash, TIBCO BusinessWorks, TIBCO Enterprise Message Service

Apache Flume for log aggregation and compliance monitoring in real-time

Pros

  • Because Apache Flume is a log-centric system, it parses and aggregates log data very well.
  • It is easy to customize for different sources (producers) of log data as well as for sinks (consumers).

Cons

  • It is very specific to log data ingestion, so it is pretty hard to use for anything other than log data
  • Data replication is not built in and needs to be added on top of Apache Flume (not a hard job to do though)
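
As one illustration of what layering replication on top might look like, the sketch below copies every event to each configured destination. It is a plain Python model under stated assumptions, not Flume code and not this reviewer's implementation; `replicate` and the in-memory lists standing in for storage are hypothetical.

```python
from typing import Callable, Iterable, List


def replicate(events: Iterable[str],
              sinks: List[Callable[[str], None]]) -> int:
    """Deliver every event to every sink; return the number of events read."""
    count = 0
    for event in events:
        for sink in sinks:
            sink(event)  # each destination receives its own copy
        count += 1
    return count


# Two in-memory lists stand in for a primary and a replica store.
primary: List[str] = []
replica: List[str] = []

n = replicate(["login ok", "login failed"], [primary.append, replica.append])
print(n, primary == replica)
```

A production version would also need to decide what happens when one destination is unavailable, which this sketch deliberately leaves out.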

Return on Investment

  • Positive impact on ROI due to a reduction in manual labor to generate and maintain compliance reports based on logs.
  • Positive impact on the business objective: compute for the log aggregation IT stack no longer has to be provisioned in advance and can instead be added on an as-needed basis.

Alternatives Considered

TIBCO Streaming (StreamBase), Apache Kafka, Google Cloud Pub/Sub, IBM MQ and Apama Streaming Analytics

Other Software Used

Apama Streaming Analytics, TIBCO Streaming (StreamBase)