Connect Sentry and Datadog to AI assistants via MCP servers for intelligent incident response. The AI queries error trends, fetches stack traces, checks infrastructure metrics, and suggests fixes — all through natural language. Build custom MCP tools that wrap monitoring APIs so the AI can triage alerts, identify root causes, and help resolve incidents faster than manual dashboard navigation.
AI-Powered Incident Response with MCP Monitoring Tools
When an alert fires at 2am, the last thing you want is to manually dig through dashboards and log files. MCP monitoring tools let AI assistants do the initial triage: query Sentry for error details, check Datadog for related metrics, read the relevant source code, and suggest a fix. This tutorial builds MCP tools that wrap Sentry and Datadog APIs, giving your AI assistant the same observability data your on-call engineers use.
Prerequisites
- A Sentry account with an API authentication token
- A Datadog account with API and application keys (optional)
- Node.js 18+ and npm installed
- Claude Desktop or Cursor with MCP support
Step-by-step guide
Build Sentry MCP tools for error investigation
Create MCP tools that query the Sentry API for recent issues, error details, and stack traces. The Sentry API provides endpoints for listing project issues, getting event details, and fetching stack traces. Wrap these as MCP tools so the AI can investigate errors by project, time range, or error type.
```typescript
// src/sentry-tools.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const SENTRY_TOKEN = process.env.SENTRY_AUTH_TOKEN!;
const SENTRY_ORG = process.env.SENTRY_ORG!;

async function sentryApi(path: string) {
  const res = await fetch(`https://sentry.io/api/0${path}`, {
    headers: { Authorization: `Bearer ${SENTRY_TOKEN}` },
  });
  return res.json();
}

export function registerSentryTools(server: McpServer) {
  server.tool("get_recent_errors", "Get recent error issues from Sentry", {
    project: z.string().describe("Sentry project slug"),
    hours: z.number().default(24).describe("Look back period in hours"),
    limit: z.number().default(10),
  }, async ({ project, hours, limit }) => {
    // Sentry requires start and end together; statsPeriod covers the look-back window on its own
    const issues = await sentryApi(
      `/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=freq&limit=${limit}&statsPeriod=${hours}h`
    );
    const summary = issues.map((i: any) => ({
      id: i.id, title: i.title, count: i.count, level: i.level,
      firstSeen: i.firstSeen, lastSeen: i.lastSeen, culprit: i.culprit,
    }));
    return { content: [{ type: "text", text: JSON.stringify(summary, null, 2) }] };
  });

  server.tool("get_error_details", "Get full details and stack trace for a Sentry issue", {
    issueId: z.string().describe("Sentry issue ID"),
  }, async ({ issueId }) => {
    const event = await sentryApi(`/issues/${issueId}/events/latest/`);
    const trace = event.entries?.find((e: any) => e.type === "exception");
    return { content: [{ type: "text", text: JSON.stringify({
      message: event.message,
      tags: event.tags,
      stackTrace: trace?.data?.values?.map((v: any) => ({
        type: v.type, value: v.value,
        frames: v.stacktrace?.frames?.slice(-5).map((f: any) => ({
          filename: f.filename, function: f.function, lineNo: f.lineNo, context: f.context,
        })),
      })),
    }, null, 2) }] };
  });
}
```

Expected result: MCP tools that query Sentry for recent errors and detailed stack traces.
Add Datadog MCP tools for metrics and infrastructure monitoring
Build MCP tools that query Datadog for metrics, active alerts, and host information. The Datadog API provides time-series metrics, alert status, and infrastructure data. These tools help the AI correlate errors with infrastructure issues — like a Sentry spike matching a CPU spike on Datadog.
```typescript
// src/datadog-tools.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const DD_API_KEY = process.env.DD_API_KEY!;
const DD_APP_KEY = process.env.DD_APP_KEY!;

async function datadogApi(path: string, params?: Record<string, string>) {
  const url = new URL(`https://api.datadoghq.com/api/v1${path}`);
  if (params) Object.entries(params).forEach(([k, v]) => url.searchParams.set(k, v));
  const res = await fetch(url, {
    headers: { "DD-API-KEY": DD_API_KEY, "DD-APPLICATION-KEY": DD_APP_KEY },
  });
  return res.json();
}

export function registerDatadogTools(server: McpServer) {
  server.tool("get_active_alerts", "Get currently active Datadog monitors/alerts", {
    tag: z.string().optional().describe("Filter by tag, e.g. service:api"),
  }, async ({ tag }) => {
    const monitors = await datadogApi("/monitor", {
      ...(tag ? { monitor_tags: tag } : {}),
    });
    const active = monitors
      .filter((m: any) => m.overall_state === "Alert" || m.overall_state === "Warn")
      .map((m: any) => ({ id: m.id, name: m.name, state: m.overall_state, message: m.message }));
    return { content: [{ type: "text", text: JSON.stringify(active, null, 2) }] };
  });

  server.tool("query_metrics", "Query Datadog metrics for a time range", {
    query: z.string().describe("Datadog metric query, e.g. avg:system.cpu.user{service:api}"),
    hours: z.number().default(1).describe("Look back period in hours"),
  }, async ({ query, hours }) => {
    const now = Math.floor(Date.now() / 1000);
    const from = now - hours * 3600;
    const result = await datadogApi("/query", {
      query, from: String(from), to: String(now),
    });
    const series = result.series?.map((s: any) => ({
      metric: s.metric, scope: s.scope,
      points: s.pointlist?.slice(-10).map((p: any) => ({
        time: new Date(p[0]).toISOString(), value: p[1]?.toFixed(2),
      })),
    }));
    return { content: [{ type: "text", text: JSON.stringify(series, null, 2) }] };
  });
}
```

Expected result: MCP tools that query Datadog for active alerts and time-series metrics.
Combine monitoring with code context for root cause analysis
The real power comes from combining monitoring data with code access. When the AI finds an error in Sentry, it can read the actual source file via a Filesystem MCP server, understand the code around the failing line, and suggest a fix. Configure all three servers (Sentry, Datadog, Filesystem) together for comprehensive incident investigation.
```jsonc
// Complete MCP config with all monitoring + code servers:
{
  "mcpServers": {
    "sentry": {
      "command": "node",
      "args": ["dist/monitoring-server.js"],
      "env": {
        "SENTRY_AUTH_TOKEN": "sntrys_...",
        "SENTRY_ORG": "your-org"
      }
    },
    "datadog": {
      "command": "node",
      "args": ["dist/datadog-server.js"],
      "env": {
        "DD_API_KEY": "your-dd-api-key",
        "DD_APP_KEY": "your-dd-app-key"
      }
    },
    "codebase": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/repo"]
    }
  }
}

// Example incident investigation prompt:
// "We got paged for high error rates. Check Sentry for recent errors in the
// api-gateway project, look at the stack traces, read the relevant source
// files, and check Datadog for any infrastructure anomalies. Give me a
// root cause analysis and suggested fix."
```

Expected result: AI investigates incidents by querying errors, reading code, checking metrics, and suggesting fixes.
Build automated alert triage workflows
Create a tool that automates the initial triage of monitoring alerts. When called, it queries recent errors from Sentry, checks related metrics in Datadog, categorizes the severity, and returns a structured incident summary. This can be triggered by a webhook when an alert fires, providing instant AI triage before a human even looks at the alert.
```typescript
server.tool(
  "triage_incident",
  "Automated incident triage: checks errors and metrics, returns severity assessment",
  {
    project: z.string().describe("Sentry project slug"),
    service: z.string().describe("Service tag for Datadog metrics"),
  },
  async ({ project, service }) => {
    // Fetch recent errors
    const issues = await sentryApi(
      `/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=date&limit=5`
    );

    // Fetch active alerts
    const monitors = await datadogApi("/monitor", { monitor_tags: `service:${service}` });
    const activeAlerts = monitors.filter((m: any) =>
      m.overall_state === "Alert" || m.overall_state === "Warn"
    );

    // Assess severity (Sentry returns count as a string, so coerce before summing)
    const errorCount = issues.reduce((sum: number, i: any) => sum + Number(i.count || 0), 0);
    const severity = activeAlerts.length > 2 || errorCount > 100 ? "CRITICAL"
      : activeAlerts.length > 0 || errorCount > 10 ? "HIGH" : "LOW";

    const triage = {
      severity,
      timestamp: new Date().toISOString(),
      errorSummary: {
        totalErrors: errorCount,
        topIssues: issues.slice(0, 3).map((i: any) => ({ title: i.title, count: i.count })),
      },
      activeAlerts: activeAlerts.map((m: any) => ({ name: m.name, state: m.overall_state })),
      recommendation: severity === "CRITICAL"
        ? "Immediate investigation required. Multiple alerts active."
        : severity === "HIGH"
        ? "Investigate soon. Error rates elevated."
        : "Monitor. Low error rates, no active alerts.",
    };

    return { content: [{ type: "text", text: JSON.stringify(triage, null, 2) }] };
  }
);
```

Expected result: An automated triage tool that assesses incident severity from monitoring data.
Complete working example
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const SENTRY_TOKEN = process.env.SENTRY_AUTH_TOKEN!;
const SENTRY_ORG = process.env.SENTRY_ORG!;

async function sentry(path: string) {
  const r = await fetch(`https://sentry.io/api/0${path}`, {
    headers: { Authorization: `Bearer ${SENTRY_TOKEN}` },
  });
  return r.json();
}

const server = new McpServer({ name: "monitoring", version: "1.0.0" });

server.tool("recent_errors", "Get recent unresolved errors from Sentry", {
  project: z.string(), hours: z.number().default(24), limit: z.number().default(10),
}, async ({ project, hours, limit }) => {
  const issues = await sentry(`/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=freq&limit=${limit}&statsPeriod=${hours}h`);
  const data = issues.map((i: any) => ({
    id: i.id, title: i.title, count: i.count, level: i.level, lastSeen: i.lastSeen,
  }));
  return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
});

server.tool("error_details", "Get stack trace for a Sentry issue", {
  issueId: z.string(),
}, async ({ issueId }) => {
  const event = await sentry(`/issues/${issueId}/events/latest/`);
  const exc = event.entries?.find((e: any) => e.type === "exception");
  const frames = exc?.data?.values?.[0]?.stacktrace?.frames?.slice(-5) || [];
  return { content: [{ type: "text", text: JSON.stringify({
    message: event.message,
    frames: frames.map((f: any) => ({ file: f.filename, fn: f.function, line: f.lineNo })),
  }, null, 2) }] };
});

server.tool("triage", "Assess incident severity", {
  project: z.string(),
}, async ({ project }) => {
  const issues = await sentry(`/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=date&limit=10`);
  // Sentry returns count as a string, so coerce before summing
  const total = issues.reduce((s: number, i: any) => s + Number(i.count || 0), 0);
  const severity = total > 100 ? "CRITICAL" : total > 10 ? "HIGH" : "LOW";
  return { content: [{ type: "text", text: JSON.stringify({
    severity, errorCount: total,
    topIssues: issues.slice(0, 3).map((i: any) => `${i.title} (${i.count}x)`),
  }, null, 2) }] };
});

async function main() {
  await server.connect(new StdioServerTransport());
  console.error("Monitoring MCP server running");
}
main().catch(e => { console.error(e); process.exit(1); });
```

Common mistakes when using MCP for monitoring and alerting
Mistake: Exposing monitoring API keys that have write access (resolving issues, silencing alerts).
How to avoid: Create read-only API tokens for monitoring services. The AI should investigate, not modify alert states.
Mistake: Returning entire stack traces with hundreds of frames, overwhelming the AI's context window.
How to avoid: Limit stack trace output to the five most recent frames per exception; these contain the relevant application code.
Mistake: Omitting timestamps from monitoring data, making it hard for the AI to correlate events.
How to avoid: Always include ISO timestamps in error and metric data so the AI can identify time-based patterns.
Best practices
- Use read-only API tokens for all monitoring service integrations
- Limit stack traces to the 5 most recent frames to keep responses concise
- Include timestamps in all monitoring data for temporal correlation
- Combine error monitoring (Sentry) with infrastructure monitoring (Datadog) for full context
- Add a triage tool that automatically assesses incident severity
- Pair monitoring tools with Filesystem tools so the AI can read the actual failing code
- Log all monitoring API calls to stderr for debugging and audit trails
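The last practice, logging every monitoring API call to stderr, can be sketched as a small wrapper around the fetch helpers used throughout this tutorial. The `withLogging` helper below is illustrative, not part of the tutorial's code; stderr is used because stdout carries MCP protocol messages:

```typescript
// Hypothetical helper: wraps any async API function so each call and its
// duration (or failure) is logged to stderr for debugging and audit trails.
type ApiFn = (path: string) => Promise<unknown>;

export function withLogging(name: string, fn: ApiFn): ApiFn {
  return async (path: string) => {
    const start = Date.now();
    try {
      const result = await fn(path);
      console.error(`[${name}] ${path} ok in ${Date.now() - start}ms`);
      return result;
    } catch (err) {
      console.error(`[${name}] ${path} FAILED: ${err}`);
      throw err;
    }
  };
}

// Usage sketch: const loggedSentry = withLogging("sentry", sentryApi);
```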
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
Build an MCP server for incident monitoring. Include tools to query Sentry for recent errors and stack traces, query Datadog for active alerts and metrics, and an automated triage tool that assesses severity. Use TypeScript.
Create monitoring MCP tools that wrap Sentry and Datadog APIs. Include recent_errors, error_details, get_active_alerts, query_metrics, and triage_incident tools. Return structured data optimized for AI interpretation.
Frequently asked questions
Can the AI resolve Sentry issues or silence Datadog alerts?
Only if you give the API token write permissions. For safety, start with read-only tokens so the AI can investigate but not modify monitoring state.
How do I handle rate limits from Sentry and Datadog APIs?
Add rate limiting to your MCP tools (see the rate limiting tutorial). Sentry allows 100 requests/minute, Datadog allows 300/minute for metrics queries.
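One minimal client-side approach is a sliding-window limiter that each MCP tool checks before calling the monitoring API. The class name and limits below are illustrative; tune the window to the quotas of the service you are wrapping:

```typescript
// Sliding-window rate limiter: allows at most maxRequests calls per windowMs.
export class RateLimiter {
  private timestamps: number[] = [];
  constructor(private maxRequests: number, private windowMs: number) {}

  // Returns true if the call may proceed; false if the caller should back off.
  tryAcquire(now = Date.now()): boolean {
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(now);
    return true;
  }
}

// Usage sketch inside a tool handler:
// const sentryLimiter = new RateLimiter(100, 60_000);
// if (!sentryLimiter.tryAcquire()) {
//   return { content: [{ type: "text", text: "Rate limit reached, retry shortly." }] };
// }
```

Returning a readable "rate limit reached" message (rather than throwing) lets the AI decide to wait and retry instead of treating the tool as broken.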
Can this work with PagerDuty or OpsGenie?
Yes. Build custom MCP tools that wrap the PagerDuty or OpsGenie REST API. The pattern is the same — query incidents, get details, and return structured data.
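As a sketch of that pattern, the helper below builds a request for PagerDuty's REST API v2 incidents endpoint (the `buildIncidentsRequest` name is illustrative; OpsGenie uses a different base URL and a `GenieKey` auth header, but the shape is the same):

```typescript
// Hypothetical sketch assuming PagerDuty REST API v2: GET /incidents filtered
// by status, authenticated with a "Token token=..." header.
export function buildIncidentsRequest(
  token: string,
  statuses: string[] = ["triggered", "acknowledged"]
) {
  const url = new URL("https://api.pagerduty.com/incidents");
  statuses.forEach(s => url.searchParams.append("statuses[]", s));
  return {
    url: url.toString(),
    headers: {
      Authorization: `Token token=${token}`,
      "Content-Type": "application/json",
    },
  };
}

// In an MCP tool handler:
// const { url, headers } = buildIncidentsRequest(process.env.PD_TOKEN!);
// const incidents = await fetch(url, { headers }).then(r => r.json());
```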
Can RapidDev set up monitoring MCP integrations?
Yes. RapidDev builds turnkey monitoring MCP setups that combine Sentry, Datadog, PagerDuty, and code access for AI-powered incident response.
How do I trigger AI triage automatically when an alert fires?
Use a webhook from your monitoring service to trigger a script that connects to the MCP server and calls the triage tool. The script can post the triage results to Slack or PagerDuty.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation