A Closer Look: What’s Inside an APM Product

November 25th, 2019

What are Application Performance Management (APM) tools? APM tools are software products that help you manage and monitor the performance of your code, application dependencies, transaction times, and user experiences. APM platforms typically combine several capabilities. The standard ones are digital experience monitoring; application discovery, tracing, and diagnostics; application analytics; and AI for IT operations. Below we’ll define each capability.

Digital Experience Monitoring

How application performance and availability are experienced, whether by a human user or a digital agent, is the bedrock of APM. Tools must be able to monitor and troubleshoot application performance from the perspective of the user experience.

Application Discovery, Tracing, and Diagnostics

These are the core APM services providing the most value to customers. The APM tools should be capable of discovering the application topology—a map of the layout of mission-critical applications in an enterprise, how applications are connected, how they can be accessed by various computers and networks, and how they are currently performing from an availability perspective. Tracing refers to a technique for monitoring microservice environments. When apps comprise many different microservices, figuring out the source of slowdowns is much more complex. Tracing, or “distributed request tracing”, is an emerging microservices monitoring technique that helps IT and DevOps teams manage distributed applications. Essentially, the application code is instrumented so that transaction paths can be followed and analyzed automatically throughout the distributed system.
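To make the idea of instrumented code concrete, here is a minimal sketch of request tracing in plain Python. It is not the API of any particular APM product: the Span and Tracer classes, the handle_checkout function, and the service names are all hypothetical, and a real system would propagate the trace ID across process boundaries (for example in HTTP headers) rather than inside one process.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One timed unit of work within a trace (hypothetical structure)."""
    trace_id: str             # shared by every span in the same request
    span_id: str              # unique to this unit of work
    parent_id: Optional[str]  # span that triggered this one, None for the root
    name: str
    start: float = 0.0
    end: float = 0.0

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000.0


class Tracer:
    """Collects spans and links each one to the currently active parent."""

    def __init__(self):
        self.finished = []  # completed spans, in order of completion
        self._stack = []    # currently open spans, innermost last

    @contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        current = Span(
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            name=name,
            start=time.perf_counter(),
        )
        self._stack.append(current)
        try:
            yield current
        finally:
            current.end = time.perf_counter()
            self._stack.pop()
            self.finished.append(current)


tracer = Tracer()


def handle_checkout():
    # Each "with" block stands in for a call to a separate microservice.
    with tracer.span("checkout"):
        with tracer.span("inventory-service"):
            time.sleep(0.01)
        with tracer.span("payment-service"):
            time.sleep(0.02)


handle_checkout()
for s in tracer.finished:
    print(f"{s.name:20s} parent={s.parent_id} {s.duration_ms:.1f} ms")
```

Because every span carries the same trace ID and a pointer to its parent, the full transaction path can be rebuilt afterwards and the slow step located.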

Application Analytics

Application analytics is the capability of the APM tool to drill down and provide root cause analysis. APM tools generate huge pools of performance data that are difficult for humans to fully understand and leverage. For this reason, built-in analytics tools provide visibility across silos and the ability to investigate more deeply using machine learning. By automatically detecting the source of transaction performance problems through statistical analysis, machine learning, or pattern recognition, APM tools can provide the all-important view of why a problem is occurring and help prevent it from happening again.
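As a rough illustration of drilling down to a root cause, the sketch below takes the spans of one slow transaction and works out how much time each service spent on its own work, excluding time spent waiting on the services it called. The span data, field names, and service names are invented for the example, and the calculation assumes child calls run one after another and that each service appears once in the trace. Real APM analytics are considerably more sophisticated.

```python
from collections import defaultdict

# Hypothetical spans from a single slow transaction (durations in ms).
spans = [
    {"id": "a", "parent": None, "service": "gateway", "ms": 480.0},
    {"id": "b", "parent": "a", "service": "orders", "ms": 450.0},
    {"id": "c", "parent": "b", "service": "database", "ms": 400.0},
    {"id": "d", "parent": "b", "service": "cache", "ms": 20.0},
]


def self_times(spans):
    """Return time each service spent on its own work, excluding its children."""
    child_total = defaultdict(float)
    for s in spans:
        if s["parent"] is not None:
            child_total[s["parent"]] += s["ms"]
    return {
        s["service"]: max(s["ms"] - child_total[s["id"]], 0.0)
        for s in spans
    }


def likely_bottleneck(spans):
    """Pick the service with the largest self time as the first suspect."""
    times = self_times(spans)
    return max(times, key=times.get)


print(self_times(spans))
print(likely_bottleneck(spans))  # database
```

The gateway span is the slowest overall, but almost all of its time is spent waiting on downstream calls. Subtracting the children's time shows that the database is where the time actually goes.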

AI for IT Operations

Today, data volumes are too large to monitor by hand. Next-generation monitoring solutions must deal with applications that are dynamic and complex. Modern applications are built from small, distributed, containerized microservices that each do a small unit of work. The result is a very large and constantly changing landscape in which manual processes can no longer diagnose problems promptly. Machine learning algorithms and statistical inference can help IT Ops cope with these complex environments by automating root cause prediction, detection of performance anomalies, and pattern recognition. Remediation actions can then be taken to fix nascent problems in production environments before they become critical.
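One of the simplest forms of statistical anomaly detection is to flag any measurement that falls far outside the recent norm. The sketch below does this with a rolling z-score over response times. The function name, window size, threshold, and sample latencies are illustrative assumptions, and production tools generally use more robust models than this.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Flag samples more than `threshold` standard deviations away from
    the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: a z-score is undefined, so skip
        z = (samples[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((i, samples[i]))
    return anomalies


# Hypothetical response times in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 103, 100, 450, 101, 99]

print(find_anomalies(latencies_ms))  # [(7, 450)]
```

Once a spike like this is flagged automatically, it can be used to raise an alert or trigger a remediation action without anyone having to watch a dashboard.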

Tips for Buyers

While each of these capabilities is important, it is crucial to consider which capabilities are most important to your specific case. For example, tracing and diagnostics capabilities are provided by virtually all vendors but some products are better than others in this regard. If this is the key purchase criterion, buyers should carefully compare these capabilities to understand which products provide the level of service required. 

There are also several other criteria to bear in mind, such as usability and support. Neither of these describes a functional capability of the product, but both can be critically important to achieving success. If a product has a non-intuitive user interface or very poor support, the sophistication of the core feature set may be moot.

The best way to understand factors like product usability, scalability, or vendor support services is to read reviews written by actual users, who frequently discuss critical factors of this kind.