A Dream for Service Providers with Complicated On-Call Scheduling and Event Management
Updated August 29, 2023

Anonymous | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Overall Satisfaction with PagerDuty

The PagerDuty platform is used across the organization for on-call management, event/alert management, and automated escalations. The platform solved a significant inefficiency within the business related to on-call and escalations. Prior to PagerDuty, escalations were manual and not well documented. This made it difficult to hold people accountable and properly document problems. Through PagerDuty's escalation policies, on-call schedules, and general workflow automation capabilities, the problem was solved fairly quickly. Because our on-call team members are paid each time they answer an on-call request, PagerDuty also made tracking those notifications much easier.
  • Automated escalations and notifications.
  • Centralized event management through machine learning and rules.
  • Emergency operations team mobilization and engagement.
  • Simple, easy to use interface without complicated system management.
  • More flexible licensing models.
  • Reporting capabilities aren't as deep or as rich as other platforms.
  • Outages, while seldom, do happen on occasion.
  • Saved about $50k USD per annum on event management alone. This does not include the additional benefits of capabilities we did not have previously, such as on-call scheduling and automated escalations.
  • Reduced on prem infrastructure footprint by about $6k USD a month.
Generally speaking, the PagerDuty platform is reliable. Outages do happen on occasion, and our organization feels them significantly even when they last only a few minutes. Because we leverage PagerDuty as our "eyes on glass" platform, we often see problems well before PagerDuty support identifies them. I can only recall one situation where the platform was completely down. Typically it is one or two minor components in a degraded state.
We do not use any of the native/pre-built integration capabilities. We have built our own integrations using the events API and webhooks. This gives us additional flexibility in how information is sent to and received from the PagerDuty platform. We have tested some of the native integrations but found they lacked flexibility. The custom code we developed is deployed on a web services engine backed by service bus technology. For most organizations this is likely overkill, but it is necessary if you are integrating multiple tools together.
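For readers weighing a similar custom integration, here is a minimal sketch of sending an alert through the PagerDuty Events API v2. It is not our production code. The function names, the routing key, and the example values are hypothetical, and the payload shows only the core fields (routing_key, event_action, and a payload with summary, source and severity) as I understand them from the public API documentation.

```python
import json
import urllib.request

# Public Events API v2 endpoint.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"

# Severity values accepted by the Events API v2.
VALID_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity="error",
                        dedup_key=None, custom_details=None):
    """Build the JSON body for a 'trigger' event (hypothetical helper)."""
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,      # integration key of the target service
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }
    if dedup_key is not None:
        # Events sharing a dedup_key are grouped into the same alert.
        event["dedup_key"] = dedup_key
    if custom_details is not None:
        event["payload"]["custom_details"] = custom_details
    return event


def send_event(event, url=EVENTS_API_URL, timeout=10):
    """POST the event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the payload builder separate from the sender is a design choice that fits a service-bus deployment: the same builder can be reused by any upstream tool, and the sender can be swapped for a queue publisher.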
PagerDuty is our first response platform. Therefore, all alerts and events flow through the platform first before being escalated to a person. All events stream into the front-end UI and are picked up manually by our tier 1 staff. If the tier 1 staff decide the problem requires immediate escalation to a higher tier team member, they will invoke an escalation policy for the appropriate resource. The higher tier staff decide how they want to be notified when a low or high priority incident is escalated to them, i.e., via phone, email, text, push notification, or some combination of these. PagerDuty is our system of record for all events and alerts. A separate tool is used as our incident system of record. We have built a webhook that generates an incident in our ITSM platform.
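The webhook-to-ITSM step can be sketched roughly as follows. This is an illustration only and not our actual handler. It assumes a PagerDuty v3-style webhook body with an "event" object carrying "event_type" and "data", and it assumes the ticket should be created on "incident.triggered". The ticket field names (short_description, external_id and so on) are hypothetical stand-ins for whatever your ITSM platform expects.

```python
def webhook_to_itsm_ticket(body):
    """Map a PagerDuty webhook body to a ticket dict for a hypothetical ITSM API.

    Returns None for event types that should not create a ticket.
    """
    event = body.get("event") or {}
    # Assumption: only newly triggered incidents open a ticket.
    if event.get("event_type") != "incident.triggered":
        return None

    data = event.get("data") or {}
    service = data.get("service") or {}
    return {
        "short_description": data.get("title", ""),
        "external_id": data.get("id"),         # PagerDuty incident ID
        "external_url": data.get("html_url"),  # link back to PagerDuty
        "service": service.get("summary"),
        "opened_at": event.get("occurred_at"),
    }
```

Returning None for unhandled event types lets the receiving web service acknowledge the webhook without creating duplicate tickets when the incident is later acknowledged or resolved.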
We leverage some of the basic features of PagerDuty analytics but not the more advanced capabilities. As a service provider, we found the analytics lacked the ability to view each customer's data independently. In discussions with PagerDuty Product Management, we learned that some of the features we were looking for were on the horizon. We have not revisited the more advanced features in about a year.
When we selected PagerDuty, we evaluated a few other solutions including Moogsoft, BigPanda, VictorOps and Splunk Enterprise. We decided on PagerDuty specifically for the automated on-call escalation capabilities. At the time we subscribed to PagerDuty, its event management was still very early and not a core strength. In comparison, event/alert management is a core competency of the other platforms. However, the other platforms either lacked automated on-call capabilities or partnered with PagerDuty to deliver them. Carrying two solutions didn't make financial sense at the time. Therefore, we went down the path of PagerDuty, leveraging the beta event management capabilities. Since then, the event management capabilities have become very sound and are exactly what we need.
Generally speaking, support is good. We rarely have bugs or platform problems that require support engagement. We find the status page to be very valuable and a great way to perform self-service status checks.

Do you think PagerDuty delivers good value for the price?

Yes

Are you happy with PagerDuty's feature set?

Yes

Did PagerDuty live up to sales and marketing promises?

Yes

Did implementation of PagerDuty go as expected?

Yes

Would you buy PagerDuty again?

Yes

If an organization lacks consistency in how on-call operations are handled or does not have a centralized way to mobilize resources quickly, then PagerDuty is a fantastic option. While the platform was not originally designed for managed service providers, it can be adopted and adjusted to support multiple customers and partners. For organizations with a larger resource footprint and a wide variety of skillsets, PagerDuty makes it a breeze to get organized and get the right people on the phone when it matters.