- In part 1 he explores the attributes of APM systems, describes anti-patterns in system monitoring, presents methods for monitoring the performance of JVMs, and offers techniques for efficiently instrumenting application source code
- In part 2 he reviews post-compilation instrumentation, specifically through interception, class wrapping, and bytecode instrumentation
- In part 3 he concludes by discussing performance and availability monitoring of an application's ecosystem
- Blind spots: monitoring some, but not all, of an environment leads to inconclusive results during analysis
- Black boxes: similar to blind spots, but scoped to applications or components. A black box is a component whose internal performance the monitoring solution has no visibility into
- Disjointed and disconnected monitoring systems: this anti-pattern contrasts siloed monitoring with consolidated monitoring. Deep but disjointed monitoring of specific application stacks (e.g. an operating system, JVM, or database) can make it difficult to identify the true root cause of a performance problem; Whitehead presents a figure in the article that illustrates this point nicely
- After-the-fact reporting and correlation: attempting to extract data from disparate monitoring tools and correlate their results into something meaningful can be very challenging
- Periodic or on-demand monitoring: many monitoring solutions carry enough overhead that they are only enabled after a problem occurs; by that point, the monitoring may be too late to capture the root cause of the problem
- Non-persistent monitoring: live displays of performance metrics are great, but unless the data can be persisted, it is difficult to establish historical context when reviewing current performance metrics
- Reliance on preproduction monitoring: monitoring in preproduction is a good thing, but relying solely on preproduction monitoring is insufficient because user behavior cannot be fully anticipated
- Pervasive: It monitors all application components and dependencies.
- Granular: It can monitor extremely low-level functions.
- Consolidated: All collected measurements are routed to the same logical APM system, supporting a consolidated view.
- Constant: It monitors 24 hours a day, 7 days a week.
- Efficient: The collection of performance data does not detrimentally influence the target of the monitoring.
- Real-time: The monitored resource metrics can be visualized, reported, and alerted on in real time.
- Historical: The monitored resource metrics are persisted to a data store so historical data can be visualized, compared, and reported.
Whitehead dives deeply into monitoring specifics, reviewing the core JVM MBeans and constructing a monitoring framework for gathering those and application-specific JMX metrics. He subsequently turns his attention to monitoring classes and methods, reviewing the four common technologies (a brief sketch combining manual instrumentation with JMX publication follows the list):
- Source code instrumentation: manually adding instrumentation to your application
- Interception: intercepting calls as they are made, such as through AOP, and capturing instrumentation metrics
- Bytecode instrumentation: modifying the bytecode of an application at runtime to inject performance collectors
- Class wrapping: wrapping or replacing a target class with another class that contains instrumentation logic
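To give a feel for the first of these techniques and for the kind of JMX collection Whitehead's framework performs, here is a minimal sketch (not code from the article): a hand-instrumented metric holder exposed through the platform MBeanServer. The class names and the `com.example.apm` ObjectName domain are illustrative assumptions.

```java
// RequestStatsMBean.java -- Standard MBean interface; JMX derives the Count and
// TotalNanos attributes from these getter signatures.
public interface RequestStatsMBean {
    long getCount();
    long getTotalNanos();
}
```

```java
// RequestStats.java -- manually instrumented metric holder, registered with the
// platform MBeanServer so any JMX client (JConsole, an APM collector) can read it.
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RequestStats implements RequestStatsMBean {

    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    public long getCount()      { return count.get(); }
    public long getTotalNanos() { return totalNanos.get(); }

    // Called from hand-instrumented application code.
    public void record(long elapsedNanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(elapsedNanos);
    }

    public static void main(String[] args) throws Exception {
        RequestStats stats = new RequestStats();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // The ObjectName domain is illustrative; pick one that fits your application.
        server.registerMBean(stats, new ObjectName("com.example.apm:type=RequestStats"));

        // Source code instrumentation: wrap the code to be measured with timing calls.
        long start = System.nanoTime();
        Thread.sleep(50);                       // stands in for real application work
        stats.record(System.nanoTime() - start);

        // In a long-running application the MBean now stays visible to JMX clients.
    }
}
```

Once registered, the attributes show up in JConsole or VisualVM, or can be polled over a remote JMX connector, which is essentially what an APM collector does on a schedule.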
In part 2 he turns his attention to post-compilation instrumentation. He reviews how to use EJB3 interceptors, servlet filter interceptors, EJB client-side interceptors and context passing, and Spring interceptors to capture application performance metrics. He describes how to use class wrapping of the JDBC driver, connection, statement, and result set objects to instrument JDBC (and hence database) calls. Finally, he describes how bytecode instrumentation (BCI) works and how the JVM provides a standard mechanism for integrating BCI through the -javaagent startup parameter. To illustrate why APM vendors choose BCI over class wrapping, he presents a performance chart comparing the two approaches.
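As a rough illustration of the -javaagent hook he describes (a sketch, not the article's code), the agent class below registers a ClassFileTransformer at JVM startup. A real APM agent would rewrite the class bytes, typically with a library such as ASM or Javassist, whereas this skeleton only identifies candidate classes; the `com/example/` package filter and the jar name are assumptions.

```java
// MonitoringAgent.java -- loaded at JVM startup via -javaagent:monitor-agent.jar
// (the jar name is illustrative); the jar's manifest must declare
// Premain-Class: MonitoringAgent.
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class MonitoringAgent {

    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                // Skip JDK and third-party classes; the package prefix is an assumption.
                if (className == null || !className.startsWith("com/example/")) {
                    return null; // returning null leaves the bytecode untouched
                }
                // A real APM agent would rewrite classfileBuffer here to wrap method
                // bodies in timing and error-counting calls.
                System.out.println("Instrumentation candidate: " + className);
                return null;
            }
        });
    }
}
```

Because the transformer sees every class as it is loaded, no application source or deployment descriptor changes are needed, which is a large part of BCI's appeal over class wrapping.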
Whitehead concludes his series by reviewing monitoring strategies for the ecosystem in which a Java application resides, namely the operating system and host environment, including databases and messaging infrastructure. He discusses the challenges and benefits of agent-based and agentless monitoring and then dives deeply into monitoring Linux/Unix and Windows systems. The next challenge he addresses is database monitoring and contextual tracing. He describes JMS and messaging systems and illustrates how to monitor them through a combination of synthetic messages and JMX. At the end of part 3 he discusses visualization and reporting and presents sample screenshots of visualization techniques, including dashboarding.
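To make the synthetic-message idea concrete, the sketch below (an assumption-laden example, not code from the article) sends a test message to a queue and times how long it takes to come back. The `monitor.probe` queue name is hypothetical, and how the ConnectionFactory is obtained (JNDI lookup or provider API) depends on the JMS broker.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

// SyntheticMessageProbe.java -- measures broker round-trip time with a test message.
public class SyntheticMessageProbe {

    // "monitor.probe" is an illustrative queue reserved for synthetic traffic.
    public static long measureRoundTripMillis(ConnectionFactory factory) throws Exception {
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue probeQueue = session.createQueue("monitor.probe");

            MessageConsumer consumer = session.createConsumer(probeQueue);
            MessageProducer producer = session.createProducer(probeQueue);

            long start = System.currentTimeMillis();
            TextMessage ping = session.createTextMessage("ping");
            producer.send(ping);

            // Wait up to five seconds; a timeout is itself a useful availability signal.
            Message reply = consumer.receive(5000);
            if (reply == null) {
                throw new IllegalStateException("synthetic message not delivered within timeout");
            }
            return System.currentTimeMillis() - start;
        } finally {
            connection.close();
        }
    }
}
```

The measured round-trip time, combined with broker-exposed JMX attributes such as queue depth (where the broker provides them), gives both an availability and a latency signal for the messaging layer.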
In short, this article series presents an in-depth overview of performance monitoring, with a level of detail that allows the reader to understand many of the technologies that are often taken for granted in off-the-shelf monitoring solutions.
For more information on performance and scalability, see InfoQ's Performance and Scalability page.