Observability Done Right
Overall Satisfaction with Datadog
In our environment, Datadog is a core part of our observability stack for application performance and infrastructure. We use RUM to monitor customer-facing performance in our web applications and to identify issues. For our APIs, APM monitors check latency, status codes, and success rate across all endpoints. We have comprehensive AWS infrastructure monitoring covering EC2 instances, RDS MySQL clusters, NLB/ALB, VPC flow logs, and Lambda functions. APM for our Java microservices covers tracing, JVM heap, GC, and thread pools. We also use it for log analysis and on-call services, including custom logs for settlement jobs and credit applications, log normalization for fast searching, resolving deadlocks, and handling 5xx bursts. Finally, we use WAF for application endpoint security.
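To make the API checks above concrete, here is a minimal Python sketch of the kind of numbers such a monitor evaluates: per-endpoint success rate, p95 latency, and a simple 5xx burst check. This is not Datadog's API or its internal logic. The `Request` record, the function names, the 60-second window, and the threshold of 5 are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    endpoint: str      # hypothetical route name, e.g. "/a"
    status: int        # HTTP status code
    latency_ms: float  # server-side latency in milliseconds
    ts: float          # request time in seconds


def endpoint_stats(requests):
    """Return {endpoint: (success_rate, p95_latency_ms)}; any non-5xx counts as success."""
    by_endpoint = defaultdict(list)
    for r in requests:
        by_endpoint[r.endpoint].append(r)
    stats = {}
    for endpoint, rows in by_endpoint.items():
        ok = sum(1 for r in rows if r.status < 500)
        latencies = sorted(r.latency_ms for r in rows)
        # Nearest-rank p95: smallest sample with at least 95% of samples at or below it.
        rank = max(1, math.ceil(0.95 * len(latencies)))
        stats[endpoint] = (ok / len(rows), latencies[rank - 1])
    return stats


def has_5xx_burst(requests, window_s=60.0, threshold=5):
    """True if any window of window_s seconds holds at least `threshold` 5xx responses."""
    times = sorted(r.ts for r in requests if r.status >= 500)
    start = 0
    for end, t in enumerate(times):
        # Shrink the window from the left until it spans at most window_s seconds.
        while t - times[start] > window_s:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False
```

In practice these values would be computed by the monitoring platform from trace and log data; the sketch only shows what "success rate" and "5xx burst" can mean as alert conditions.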
Pros
- Log Management
- APM - Application Performance Monitoring
- Infrastructure Monitoring
- Security Monitoring
- Dashboards, custom monitors, real-time visibility
- Alerting & On-Call Services - teams, email, phone, app popup
Cons
- LLM
- AI Observability
- Endpoint Security
- Automated QA Testing, Synthetic Testing
Return on Investment
- Datadog has had a very positive ROI for us because it directly reduced downtime, improved customer experience, and helped our team detect issues early and operate more efficiently.
- Engineers can quickly correlate logs, metrics, and traces in one place instead of spending hours searching across servers.
- On the infrastructure side, visibility into CPU, memory, and RDS usage helped us right-size several EC2 instances, resulting in ~10–15% cloud cost savings.
First things first: Datadog is easy to use and very easy to implement in any infrastructure. It provides custom dashboards and monitors. I’ve used or evaluated Grafana, Prometheus, Amazon CloudWatch, and Dynatrace, and each tool has strong capabilities. Prometheus + Grafana provide solid open-source metric collection and visualization, but they require more maintenance and don’t offer native logs + traces out of the box. CloudWatch integrates well with AWS, but it becomes difficult to use for deep APM, log correlation, or cross-service troubleshooting in large distributed systems. Dynatrace offers powerful automation and root-cause analysis but is significantly more complex to implement and manage.
Do you think Datadog delivers good value for the price?
Yes
Are you happy with Datadog's feature set?
Yes
Did Datadog live up to sales and marketing promises?
Yes
Did implementation of Datadog go as expected?
Yes
Would you buy Datadog again?
Yes
