
How to Instrument Your Node.js App Without a Single Monkey-Patch
Stop fighting with brittle library overrides and start leveraging Node's internal publishing bus for stable, production-grade telemetry.
Why do we keep hacking the internal prototypes of our dependencies just to get a glimpse of a performance bottleneck?
If you’ve ever looked at the source code of a traditional Node.js APM (Application Performance Monitoring) agent, you’ve probably seen something that looks like a crime scene. You’ll find code that reaches into the http module, grabs the request method, wraps it in a custom function, and then stuffs the wrapper back into the module. This is monkey-patching. It’s effective, but it’s also the equivalent of performing open-heart surgery with a butter knife while the patient is running a marathon.
Monkey-patching is inherently brittle. If the Node.js core team changes an internal property name, your instrumentation breaks. If two different libraries try to patch the same method, they might overwrite each other, leading to "lost" traces or, worse, cryptic stack overflows.
But there is a better way. For the last few versions, Node.js has been quietly maturing a feature called diagnostics_channel. It’s a formal, native pub/sub bus designed specifically for telemetry. It allows library authors to broadcast what they're doing and allows you to listen in without ever touching their source code or overriding their methods.
The Fragility of the Old Ways
Before we look at the solution, let’s talk about why the "old way" is so painful. When we monkey-patch, we’re essentially doing this:
```javascript
const http = require('http');

const originalRequest = http.request;

// Replace the real http.request with a wrapper that times each call.
http.request = function (options, callback) {
  const start = performance.now();
  console.log('HTTP Request started');
  const req = originalRequest.call(this, options, (res) => {
    res.on('end', () => {
      console.log(`Request took ${performance.now() - start}ms`);
    });
    if (callback) return callback(res);
  });
  return req;
};
```

This looks fine in a 10-line snippet. But what happens when `http.request` starts supporting a new argument type? Or what if the callback is optional? What if another library also patches `http.request` and doesn't preserve the `this` context correctly?
The result is a house of cards. You’re not just observing the system; you’re changing it. This is the observer effect at its most destructive.
Enter diagnostics_channel
The diagnostics_channel module provides a way to decouple the *producers* of telemetry from the *consumers*.
Imagine the undici (the modern HTTP client for Node) developers want to let you know when a request starts. Instead of telling you to wrap their functions, they simply publish a message to a named channel. If no one is listening, the overhead is near zero. If someone is listening, they get a standardized object containing the request details.
Here is the mental model:
1. The Producer: (A library or Node core) creates a channel and publishes data to it.
2. The Consumer: (Your app or an APM) subscribes to that channel and processes the data.
How to Listen to the Internal Bus
Let’s see how this works in practice. We don't need to install anything; this is built into Node.js.
```javascript
const diagnostics_channel = require('node:diagnostics_channel');

// Subscribe to a specific channel
diagnostics_channel.subscribe('http.client.request.start', (message) => {
  const { request } = message;
  console.log(`Outgoing request to: ${request.host}${request.path}`);
});
```

That’s it. You didn't wrap a function. You didn't touch the http module's prototype. You simply told Node: "Whenever an HTTP client request starts, send that data here."
The power here is in the naming convention. Node.js core and major libraries like undici, mongodb, and pg are increasingly adopting these standardized channel names.
Connecting the Dots with AsyncLocalStorage
Telemetry is useless if it’s just a stream of disconnected events. If I see a database query took 500ms, I need to know which incoming HTTP request triggered that query.
In the old days, we used continuation-local-storage or the notoriously difficult async_hooks API. Today, we use AsyncLocalStorage (ALS). ALS allows us to store a "context" (like a Trace ID) that follows the execution flow across asynchronous boundaries.
Let’s build a minimal, non-invasive tracer that links incoming requests to outgoing HTTP calls.
```javascript
const { AsyncLocalStorage } = require('node:async_hooks');
const dc = require('node:diagnostics_channel');
const http = require('node:http');
const crypto = require('node:crypto');

const tracerStorage = new AsyncLocalStorage();

// 1. Every time a request hits our server, we create a Trace ID
const server = http.createServer((req, res) => {
  const traceId = crypto.randomUUID();
  tracerStorage.run({ traceId }, () => {
    // Everything inside this block and its async children
    // can access this traceId.

    // Simulate some logic that makes an external call
    http.get('http://google.com', () => {
      res.end('Done');
    });
  });
});

// 2. We use diagnostics_channel to "catch" the outgoing call
dc.subscribe('http.client.request.start', (message) => {
  const context = tracerStorage.getStore();
  if (context) {
    console.log(`[Trace ID: ${context.traceId}] Outgoing HTTP call started`);
    // We could even inject the header into the outgoing request here!
    message.request.setHeader('x-trace-id', context.traceId);
  }
});

server.listen(3000);
```

In this example, we’ve achieved something magical: we are tracking the flow of a request through our system and even propagating headers to downstream services, all without ever modifying the http library’s logic.
Why This Is Safer for Production
When you monkey-patch, you run the risk of introducing memory leaks. If you wrap a function and accidentally hold onto a reference to a large object in a closure, that object will never be garbage collected.
diagnostics_channel is designed with performance in mind. If there are no subscribers to a channel, the publish() call returns immediately. The data being published is usually a plain JavaScript object. There are no heavy stack trace captures or prototype chains to traverse.
Furthermore, it’s synchronous. When a library publishes to a channel, the subscribers are executed immediately in the same tick. This is crucial because it allows you to modify the data (like adding a header) before the underlying operation proceeds.
Implementation: A Real-World Database Logger
Let’s get more practical. Suppose you want to log every slow query in your application. Many modern database drivers (like pg for PostgreSQL) are starting to support diagnostics_channel.
Even if a library doesn't support it yet, you can add it to your own internal data layer as a way to "future-proof" your instrumentation.
```javascript
// db-wrapper.js (A conceptual example — `db` stands in for your driver)
const dc = require('node:diagnostics_channel');

const queryChannel = dc.channel('my-app.db.query');

async function runQuery(sql, params) {
  const startTime = performance.now();

  // Broadcast that a query is starting — but only build the
  // message object if someone is actually listening.
  if (queryChannel.hasSubscribers) {
    queryChannel.publish({ sql, params, startTime });
  }

  const result = await db.execute(sql, params);
  return result;
}
```

Now, anywhere in your app—or even in a separate "telemetry" module—you can subscribe:
```javascript
dc.subscribe('my-app.db.query', ({ sql, startTime }) => {
  const duration = performance.now() - startTime;
  if (duration > 100) {
    console.warn(`Slow Query [${duration.toFixed(2)}ms]: ${sql}`);
  }
});
```

By using `channel.hasSubscribers`, we ensure that we only do the work of creating the message object if someone actually cares about it. This is how you build high-performance systems.
The "Tracing State" Gotcha
There is one nuance you need to be aware of. diagnostics_channel messages are often stateless. A request.start event tells you something started, but how do you track it until it ends?
You can’t easily share a variable between the "start" subscriber and the "end" subscriber because they are separate function calls. The trick is to attach a small bit of state to the objects being passed through the channel if they are objects that persist (like the request object itself), or use a WeakMap.
```javascript
const requestTimes = new WeakMap();

dc.subscribe('http.client.request.start', (message) => {
  requestTimes.set(message.request, performance.now());
});

dc.subscribe('http.client.response.finish', (message) => {
  const start = requestTimes.get(message.request);
  if (start) {
    const duration = performance.now() - start;
    console.log(`Request finished in ${duration}ms`);
    requestTimes.delete(message.request);
  }
});
```

Using a `WeakMap` is the professional way to handle this. It ensures that once the request object is garbage collected, our timing data is also cleaned up. No memory leaks, no mess.
What about OpenTelemetry?
You might be wondering: "Doesn't OpenTelemetry (OTel) do all this for me?"
Yes and no. For a long time, the OTel Node.js SDK relied heavily on monkey-patching (via a library called shimmer). However, the community is moving fast. The latest OTel instrumentations are being rewritten to use diagnostics_channel under the hood.
By understanding diagnostics_channel yourself, you can:
1. Instrument internal business logic that OTel doesn't know about.
2. Debug your instrumentation. If your traces are missing, you can subscribe to the channels manually to see if the data is even being emitted.
3. Reduce dependencies. If you only need simple logging or performance tracking, you might not need the massive overhead of the full OTel SDK.
The Ecosystem Shift
We are currently in a transition period. Not every library supports diagnostics_channel yet. But the tide is turning. Node.js core already publishes built-in channels for HTTP, net sockets, UDP sockets, child processes, and worker threads.
The undici library (the engine behind the global fetch in Node) is a first-class citizen in the diagnostics_channel world. If you use fetch, you are already generating these events.
Making the Switch
If you’re maintaining an internal library or a shared service at your company, stop exposing hooks and callbacks for logging. Instead, publish to a diagnostic channel. It’s a cleaner API. It says to your users: "Here is a stream of data about what I'm doing. Consume it if you want, ignore it if you don't."
Here is a quick checklist for moving away from brittle instrumentation:
1. Audit your `require`s. Look for anything that modifies prototypes or wraps core modules.
2. Check for native channels. See if the libraries you use (like pg or redis) have added diagnostic channel support in their latest versions.
3. Use `AsyncLocalStorage` for context. Stop passing traceId as a function argument through ten layers of your stack.
4. Standardize your names. If you create your own channels, follow the namespace.object.event pattern (e.g., checkout-service.order.created).
Closing Thoughts
Node.js has grown up. We no longer need to rely on the "hacker" tactics of the early 2010s to understand what our apps are doing. By leveraging diagnostics_channel, we treat our telemetry as a first-class citizen—just as important as our business logic, but completely decoupled from it.
It’s stable, it’s fast, and it won't break when you upgrade your Node version. That’s a win for everyone, except maybe for the people who enjoy debugging "Maximum call stack size exceeded" errors at 2 AM.


