Nagios Empowered Monitoring
August 08, 2018
Score 10 out of 10
Overall Satisfaction with Nagios
As Nagios was the first monitoring system available to users about 15 years ago, I chose it to monitor a few dozen servers in the organization. Over time, however, the server count grew to 500+ and the service count to 5000+. Nagios remained stable for years on a simple dual-core machine (2 GB). Its ability to proactively detect issues in the system keeps our engineers informed hours (or days) ahead of a pending disaster.
As Nagios employs both pull & push monitoring, deploying plugins behind a firewall was never a hassle. Customization is simple, as any engineer with basic programming knowledge can create a plugin within minutes. I specifically chose Bash, Java & PHP as those are more familiar to me, while others chose Python, Perl or C#.
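To illustrate how small such a plugin can be, here is a minimal sketch in Python. It relies only on the standard Nagios plugin conventions: the exit code carries the state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and anything after the `|` in the output line is performance data. The check name `HEAP`, the thresholds and the `read_heap_used_pct` probe are hypothetical placeholders, not something from our actual setup.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, status_line)."""
    if used_pct is None:
        return UNKNOWN, "HEAP UNKNOWN - no reading available"
    if used_pct >= crit:
        code = CRITICAL
    elif used_pct >= warn:
        code = WARNING
    else:
        code = OK
    # Text before the pipe is for humans; text after it is performance data
    # in the form label=value[unit];warn;crit;min;max.
    perfdata = f"heap_used={used_pct:.1f}%;{warn:g};{crit:g};0;100"
    return code, f"HEAP {LABELS[code]} - {used_pct:.1f}% used | {perfdata}"


def read_heap_used_pct():
    """Hypothetical probe; replace with a real reading from your application."""
    return 42.5


def run_check():
    """Print the status line and return the exit code for Nagios."""
    code, line = evaluate(read_heap_used_pct())
    print(line)
    return code

# As a real plugin script, finish with: import sys; sys.exit(run_check())
```

The same structure carries over to Bash, PHP or any other language, since Nagios only looks at the exit code and the printed line.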
I have configured Nagios with the following technologies for a better user experience.
- MySQL (storage & retrieval) using NDOUtils
- NRDP (For push alerting when your servers are not accessible due to firewall rules)
- Pnp4Nagios (for basic RRD graphing - I have tweaked the RRD settings to allow granular data over months of storage)
- Grafana (for easy aggregated graphing, dashboards, heat-maps, alerts, user )
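As a rough idea of what the NRDP push in the list above involves, the sketch below builds the form body for one passive service check result. It assumes the `token`, `cmd=submitcheck` and `XMLDATA` form fields and the `checkresults` XML layout as used by NRDP's own send scripts; field names can differ between NRDP versions, so verify them against your installation. The function name and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_nrdp_form(token, host, service, state, output):
    """Return a URL-encoded form body for one passive service check result."""
    root = ET.Element("checkresults")
    # checktype="1" marks the result as passive (assumed NRDP convention).
    result = ET.SubElement(root, "checkresult", type="service", checktype="1")
    ET.SubElement(result, "hostname").text = host
    ET.SubElement(result, "servicename").text = service
    ET.SubElement(result, "state").text = str(state)  # 0/1/2/3, same as plugin exit codes
    ET.SubElement(result, "output").text = output
    xml_data = ET.tostring(root, encoding="unicode")
    return urlencode({"token": token, "cmd": "submitcheck", "XMLDATA": xml_data})


# Hypothetical values for illustration only.
body = build_nrdp_form("secret", "web01", "Heap Usage", 0, "HEAP OK - 42.5% used")
```

The server behind the firewall would then POST `body` to the NRDP URL on the Nagios host over HTTP(S), so only an outbound connection is needed.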
- Ability to monitor the application logic - Regardless of the language the application was written in, a simple plugin script can be quickly constructed to measure the key metrics of a running application (memory, heap, CPU%, DB connections, limits, delays in functions).
- Open Source and the largest community of developers. There's a plugin for everything, including surveillance equipment, cameras, big-data analysis, AWS & Microsoft services. Over 10,000 plugins are available.
- The Nagios data can be stored and plotted in any time-series graphing system. We chose Grafana as it supports query graphing & dashboards.
- Configuring and deploying the various open source plugins can be troublesome at first. It takes a bit of patience to connect all the various components (Nagios, NDOUtils, MySQL, NRDP, Pnp4Nagios, Batch-Processing, Grafana).
- Most configuration is done through the command line & configuration files. Although this allows exceptional tuning, there is a moderate learning curve.
- The Nagios UI might need better CSS styling as it still has the year 2005 look and feel. Although there are several mediocre UIs available, the heart of Nagios lies in monitoring.
- The only running cost for Nagios was the server hosting, which was under $20/month (for a few dozen servers and hundreds of services).
- Once our system had grown to 5000+ services, the charges were still under $100/month. The software and all components were open source. As the Nagios community is large, getting help was just a few hours away.
- Due to the complexity involved with initial installation and configurations, an experienced Linux engineer is required during the initial stages.
We have tested several other monitoring products which were able to monitor the basic metrics (memory, disk usage, CPU%, uptime, running service status, port 80 up/down). Although some offered far better UIs, they lacked the ability to monitor ANYTHING. Zabbix, being the only contender worthy of competing, is a good alternative to Nagios. We also tried Zenoss Core & OpenNMS, which were good enough for non-Linux engineers to get started with. OP5 was another service-oriented monitoring solution we evaluated. Apart from Nagios, Consul is heavily used to monitor & register the micro-service systems & end-point URLs. Due to the time invested (9+ years) in Nagios, we were able to get more components installed and configured more easily than with the alternatives.
Nagios monitoring is well suited for any mission-critical application that requires per-second (or per-minute) monitoring. This would probably include even a shuttle launch. As Nagios was built around Linux, most (85%) plugins are Linux based, therefore it's more suitable for a Linux environment.
As Nagios (and dependent components) requires complex configurations & compilations, an experienced Linux engineer would be needed to install all relevant components.
Any company that has hundreds (or thousands) of servers & services to monitor would require a stable monitoring solution like Nagios. I have seen Nagios used in extremely mediocre ways, but its core power shows when it's fully configured with all the remaining open-source components (i.e. MySQL, Grafana, NRDP etc). Nagios in the hands of an experienced Linux engineer can transform the organization's monitoring by taking preventative measures before a disaster strikes.