Apache Flume is well suited when the use case is log data ingestion and aggregate only, for example for compliance of configuration management. It is not well suited where you need a general-purpose real-time data ingestion pipeline that can receive log data and other forms of data streams (eg IoT, messages).
Apache Flume is on par with Scribe with similar functions. Apache Kafka is a generation purpose while Apache Flume is specific to log aggregation. Google Pub/Sub and IBM MQ are costlier than Apache Flume ( open source ) and have a lot more cost associated with them. Apama Streaming Analytics and Tibco Steaming are more comprehensive streaming solutions than Apache Flume so for deeper performance guarantees, it is easier to use Apache Flume.
Apache Flume is a key software piece in BigData environments, we have used it along with CDC (Change Data Capture) to ingest near real time database changes into Kafka so the data is available for realtime analysis, machine learning, dynamic dashboards and so
We have successfully integrated also Apache Flume in log acquisition solutions (mainly PaaS and Docker) where application log is difficult access.
Apache Flume is well suited in small batch and near real time processing projects, taking data from one point to another with local processing (I mean not external enrichment).
Filtering, transforming and multiple push destinations are common grounds for Flume.
It is not so nice to use if your data needs external enrichment (taking data from external databases or web services), as transactions and (micro)batches may lead to reprocessing and it relies upon the application to avoid duplicates.
Apache Flume is a very good solution when your project is not very complex at transformation and enrichment, and good if you have an external management suite like Cloudera, Hortonworks, etc. But it is not a real EAI or ETL like AB Initio or Attunity so
you need to know exactly what you want. On the other hand being an opensource project give Apache a lot of room to personalize thanks to its plug-able architecture and has a very nice performance having a very low CPU and Memory footprint, a single server can do the job on many occasions, as opposed to the multi-server architecture of paid products.