Try Splunk if you haven’t already, it’ll make devs and ops’ lives easier
May 02, 2021
Score 8 out of 10
Vetted Review
Verified User
Software Version
SignalFx Microservices APM
Overall Satisfaction with Splunk Infrastructure Monitoring (formerly SignalFx)
We’re using Splunk [Infrastructure Monitoring (formerly SignalFx)] to report real-time metrics when the number or percentage of a specific event is important. For example, we use it to detect error cases or monitor the real-time throughput of the system. We also use detectors to get alerts when a condition is met. It’s widely used by almost all engineering teams.
Pros
- Metric breakdowns
- Dashboards and charts
- Detectors and alerting mechanism
Cons
- Chart sharing needs improvement because it creates links that are only valid for 7 days.
- Metrics search can be improved. Most of the time I want to see which specific metric is used in which charts. This would make it much easier to find the charts that I’m looking for.
- There is a 5000 time series limit on each metric. If a metric has a breakdown with more than 5000 combinations, only some of them are reported, and this sometimes makes my charts unreliable. It would be nice to support more time series, at least through a configuration option.
- Definitely MTTR
- Reduced downtime because when we get a no-heartbeat alert, we jump in and resolve the issue ASAP
- Increased monitoring costs, since Splunk is one of the 3 monitoring tools we use
They’re not for the same purpose, but we also use NewRelic and Honeycomb for monitoring. NewRelic is used for HTTP client monitoring, covering system-related throughput, errors, database, and external client monitoring. Honeycomb is used to monitor actual HTTP request/response values. Splunk [Infrastructure Monitoring] is used for real-time, application-related throughput and error monitoring.
Do you think Splunk Observability Cloud delivers good value for the price?
Yes
Are you happy with Splunk Observability Cloud's feature set?
Yes
Did Splunk Observability Cloud live up to sales and marketing promises?
I wasn't involved with the selection/purchase process
Did implementation of Splunk Observability Cloud go as expected?
I wasn't involved with the implementation phase
Would you buy Splunk Observability Cloud again?
Yes
We have external dependencies, and our revenue depends on the responses we get from these external customers. With the help of the alerting feature, whenever there is an outage on the external side, we can ping them to fix their system. This gives us shorter outages and therefore less lost revenue.
Our team is the one that uses most of the Splunk quota. Most of the time, the operations team sends us warnings to use fewer time series if possible.
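To make the time series concern concrete, here is a minimal Python sketch of how one might estimate the worst-case number of time series a metric can produce from its breakdown dimensions. The dimension names and counts are hypothetical, and the 5000 figure is simply the per-metric limit mentioned in this review, not a value taken from Splunk documentation.

```python
from math import prod

# Per-metric limit as quoted in this review (an assumption, not a documented constant).
TIME_SERIES_LIMIT = 5000


def worst_case_series(dimension_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric.

    Every unique combination of dimension values is a separate time series,
    so the worst case is the product of the distinct values per dimension.
    """
    return prod(dimension_cardinalities.values())


# Hypothetical breakdown for an error-count metric.
dims = {"endpoint": 40, "status_code": 12, "partner": 15}
total = worst_case_series(dims)
over_limit = total > TIME_SERIES_LIMIT
print(f"worst case: {total} series, over limit: {over_limit}")
```

In this made-up case, dropping or bucketing a single high-cardinality dimension would be enough to bring the metric back under the limit.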
We’re very careful to choose the right alerting conditions to prevent false alerts and alert storms. There are plenty of options to choose from, but we mostly use heartbeat checks and thresholds. In our experience, these are the most reliable ones.
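The two conditions can be illustrated with a small Python sketch. This is not SignalFx detector code or its API; the `Sample` type, function names, and parameters are all hypothetical and only show the general idea: a heartbeat check fires when data stops arriving, and a threshold check that requires several consecutive breaches helps avoid alerting on a single spike.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since epoch
    value: float


def heartbeat_missing(samples: list[Sample], now: float, max_silence_s: float) -> bool:
    """True when no data point arrived within the last max_silence_s seconds."""
    if not samples:
        return True
    last_seen = max(s.timestamp for s in samples)
    return now - last_seen > max_silence_s


def threshold_breached(samples: list[Sample], limit: float, min_consecutive: int) -> bool:
    """True when the most recent min_consecutive samples all exceed limit."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if len(samples) < min_consecutive:
        return False
    recent = sorted(samples, key=lambda s: s.timestamp)[-min_consecutive:]
    return all(s.value > limit for s in recent)


# Hypothetical error-count samples, one every 10 seconds.
samples = [Sample(0, 1), Sample(10, 5), Sample(20, 6)]
print(heartbeat_missing(samples, now=100, max_silence_s=30))
print(threshold_breached(samples, limit=4, min_consecutive=2))
```

Requiring more consecutive breaches trades slower detection for fewer false alerts, which is the balance described above.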