Likelihood to Recommend
Icinga is a world-class monitoring system. It can be used for most general monitoring situations. It is not a silver bullet, however, and there are instances where domain-specific monitoring systems are necessary. Even then, the output from those monitoring systems can be funneled into Icinga as a central monitoring and alerting system.
Splunk Infrastructure Monitoring is well suited for any complicated environment where you have apps and servers across multiple clouds, platforms, and products. If you have a data centre where all your apps and servers are in one single network, you could probably get away with older solutions. But for any modern, complex, hybrid-cloud microservices environment, Splunk Infrastructure Monitoring is a must-have.
Pros
Wealth of community-developed plugins. Stable codebase. Icinga 2 supports distributed monitoring. Very performant; can support tens of thousands of checks per server.
SignalFX handles historical metric aggregation exceptionally well, providing a multifaceted approach to event detection based on anomalies. SignalFX's cost is incredibly flexible, thanks to its DPM (data points per minute) pricing model versus the traditional "per host" model that most monitoring SaaS products use. SignalFX support is responsive and knowledgeable, and very eager to help solve your immediate problems. SignalFX's range of integrations is vast and constantly growing, making adoption easy even when multiple different open-source technologies are used in your stack.
Cons
Steep learning curve; setting up Icinga from scratch can be a bit of a challenge starting out. If the ido2db process fails, the UI stops updating, which can be very frustrating. There is no simple mechanism for adding new hosts and services through the web UI; it's all very config-file based.
Better integration with our clients (native mobile client SDKs would be great). A way to easily tag filters and move them across metrics/formulas. The alert system raises false [alarms] too easily and is hard to configure. The ability to group by more dimensions (and have more values on each).
Likelihood to Renew
Icinga is a solid solution which does everything it promises. It is backwards compatible with most Nagios instances, making the transition very easy. Once you get the hang of installing new plugins and editing configuration files, expanding its monitoring capabilities is easy.
Good: Stable system with low error rate. Easy to use for simple use cases. Bad: UI is not very clear for complex usage. Mobile view (when logged in from a phone) is bad. No library for .NET.
Usability
I find that learning the interface can take some time. We need a better show-and-tell on how the Teams pages, Dashboard Groups, Dashboards and charts delay. Advanced SignalFlow is sometimes hard to build. Some better samples of advanced SignalFlow would be helpful. For example, Splunk SPL has a vast resource of examples.
Alternatives Considered
Icinga is better than Nagios because of its nicer user interface. New Relic can monitor CPU/memory and disk usage, but it's more of a performance and application troubleshooting tool than a monitoring system.
They're not for the same purpose, but we're using New Relic and Honeycomb for monitoring purposes. New Relic is used for HTTP client monitoring: system-related throughput, errors, database, and external client monitoring. Honeycomb is used to monitor actual HTTP request/response values. Splunk [Infrastructure Monitoring] is used for real-time application-related throughput and error monitoring.
Return on Investment
With one check you know which applications are faulty, e.g. after an upgrade, which is a big time saver. You easily detect outages in the applications so that your customer ideally does not even realize there was an outage. Detect whether the environment delivers the same result in the same time as before, to spot shortages. Additional information when debugging. Saved us several hours where we could simply point to a database which was slow.
Reduced downtime. Caused us to get a lot of spam when we redeployed apps and old instances stopped sending metrics. Muting alerts solves this, but people often forget to do it or do it incorrectly. Helped us find historical info about instances/apps.