The De-Facto Standard
August 01, 2018

Anonymous | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Overall Satisfaction with PagerDuty

We use PagerDuty to alert engineers when our users are experiencing degraded service or service outages. It is used by the engineering and product teams. For critical issues, the teams that own the affected services are notified first. Unacknowledged alerts are escalated to other members of the engineering team.
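The escalation order described above can be pictured with a small sketch. This is not PagerDuty's API or our actual configuration; the tier names and the 15-minute timeouts are assumptions made up for illustration. It only shows the idea that an alert moves to the next tier while it remains unacknowledged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tier:
    name: str             # who is paged at this level (hypothetical labels)
    timeout_minutes: int  # how long to wait for an acknowledgement (assumed)


# Hypothetical policy: owners first, then other engineers, then leadership.
POLICY: List[Tier] = [
    Tier("owning team", 15),
    Tier("other engineers", 15),
    Tier("head of engineering and CTO", 15),
]


def tier_to_notify(policy: List[Tier], minutes_unacknowledged: int) -> Tier:
    """Return the tier that is being paged after the alert has gone
    unacknowledged for the given number of minutes."""
    elapsed = 0
    for tier in policy:
        elapsed += tier.timeout_minutes
        if minutes_unacknowledged < elapsed:
            return tier
    # Past the end of the policy the alert stays with the last tier.
    return policy[-1]
```

For example, `tier_to_notify(POLICY, 20)` returns the "other engineers" tier, because the owning team's assumed 15-minute window has already passed.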
  • PagerDuty has simple integrations for many of the other services we use, including AWS and New Relic.
  • It is easy to manage on-call schedules, alert policies and contact information.
  • PagerDuty is reliable; we've never experienced downtime.
  • The interface for switching on-call shifts could be easier to use.
  • The positive impact is clear. Alerting is essential; we cannot wait to hear from our users to learn of problems. Building reliable alerting systems is not trivial. Delegating this work to PagerDuty allows engineers to focus on our product.
CloudWatch and New Relic are not exactly comparable, but they can be used to create some alerts. PagerDuty has more features, including on-call schedules and escalation policies.
PagerDuty is commonly used in technology companies; most engineers are accustomed to it and it works well. It is well suited to alerting teams when a SaaS product is degraded or unavailable.

Using PagerDuty

30 - Engineers, data scientists and product managers use PagerDuty. The engineers and product manager who own the feature or service are the first to be notified. The head of engineering and CTO are the last on the escalation policy.
10 - After the initial policies and integrations are configured, little ongoing management is required. Most engineers are capable of configuring new integrations when the need arises. One devops engineer created the initial integrations.
  • Alerting engineers and product managers when the user experience is degraded.
  • Alerting engineers when thresholds for maintenance are exceeded.
  • Helping engineers and product managers understand who is on-call and who to contact.
  • The inclusion of product managers in our escalation policies might be somewhat unusual. Otherwise, we use PagerDuty in a narrow, predictable way.
Understanding when the customer experience is degraded is essential to our business. Everyone understands how to use PagerDuty and it is integrated with our playbooks for responding to emergencies. PagerDuty is reliable, and provides the features we require.

Using PagerDuty

The UI is more complex than I would like. Part of the challenge is that most users use PagerDuty infrequently; I don't remember how I changed a policy last time. Another part of the challenge is that some users expect alerting to be a trivial feature, and are reluctant to invest any time in reading the documentation.
Pros
  • Technical support not required
  • Well integrated
  • Consistent
  • Feel confident using
Cons
  • Unnecessarily complex
  • Difficult to use
  • Slow to learn
  • Lots to learn
  • It is easy to acknowledge and resolve alerts.
  • It is easy to understand who is on-call, and when they are on-call.
  • It is easy to integrate with many, but not all, of the products and services we use.
  • Some services require an API integration that is confusing to configure.
  • One-off changes to the on-call schedule are confusing to configure. Most users don't make these changes often enough to remember the steps the next time they need them.
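For services without a ready-made integration, the work usually comes down to sending PagerDuty an event in the format it expects. The review does not say which API our integrations use; as one illustration, the sketch below builds a trigger event in the shape of PagerDuty's Events API v2. The helper name, the integration key placeholder and the service name are hypothetical, and nothing is sent over the network.

```python
import json

# Severity values accepted by the Events API v2 payload.
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key: str, summary: str, source: str,
                        severity: str = "critical") -> dict:
    """Build (but do not send) a trigger event body.

    routing_key is the integration key of the PagerDuty service;
    the value used below is a placeholder, not a real key.
    """
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }


# Hypothetical example values.
event = build_trigger_event(
    "YOUR_INTEGRATION_KEY",
    "Checkout latency above threshold",
    "checkout-service",
)
body = json.dumps(event)  # JSON body that would be POSTed to the enqueue endpoint
```

Sending the event would mean an HTTPS POST of `body` to `https://events.pagerduty.com/v2/enqueue`; that step is left out here so the sketch stays self-contained.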
Yes - The mobile interface is adequate for viewing, acknowledging and resolving alerts. It would not be suitable for configuring alerting policies or integrations. Most of the limitations of the mobile interface are acceptable because users will require a desktop experience for the other tools they are using.