Connect Sentry and Datadog to AI assistants via MCP servers for intelligent incident response. The AI queries error trends, fetches stack traces, checks infrastructure metrics, and suggests fixes — all through natural language. Build custom MCP tools that wrap monitoring APIs so the AI can triage alerts, identify root causes, and help resolve incidents faster than manual dashboard navigation.
AI-Powered Incident Response with MCP Monitoring Tools
When an alert fires at 2am, the last thing you want is to manually dig through dashboards and log files. MCP monitoring tools let AI assistants do the initial triage: query Sentry for error details, check Datadog for related metrics, read the relevant source code, and suggest a fix. This tutorial builds MCP tools that wrap Sentry and Datadog APIs, giving your AI assistant the same observability data your on-call engineers use.
Prerequisites
- A Sentry account with an API authentication token
- A Datadog account with API and application keys (optional)
- Node.js 18+ and npm installed
- Claude Desktop or Cursor with MCP support
Step-by-step guide
Build Sentry MCP tools for error investigation
Create MCP tools that query the Sentry API for recent issues, error details, and stack traces. The Sentry API provides endpoints for listing project issues, getting event details, and fetching stack traces. Wrap these as MCP tools so the AI can investigate errors by project, time range, or error type.
```typescript
// src/sentry-tools.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const SENTRY_TOKEN = process.env.SENTRY_AUTH_TOKEN!;
const SENTRY_ORG = process.env.SENTRY_ORG!;

async function sentryApi(path: string) {
  const res = await fetch(`https://sentry.io/api/0${path}`, {
    headers: { Authorization: `Bearer ${SENTRY_TOKEN}` },
  });
  return res.json();
}

export function registerSentryTools(server: McpServer) {
  server.tool("get_recent_errors", "Get recent error issues from Sentry", {
    project: z.string().describe("Sentry project slug"),
    hours: z.number().default(24).describe("Look back period in hours"),
    limit: z.number().default(10),
  }, async ({ project, hours, limit }) => {
    // Sentry requires start and end together; statsPeriod covers the look-back window on its own
    const issues = await sentryApi(
      `/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=freq&limit=${limit}&statsPeriod=${hours}h`
    );
    const summary = issues.map((i: any) => ({
      id: i.id, title: i.title, count: i.count, level: i.level,
      firstSeen: i.firstSeen, lastSeen: i.lastSeen, culprit: i.culprit,
    }));
    return { content: [{ type: "text", text: JSON.stringify(summary, null, 2) }] };
  });

  server.tool("get_error_details", "Get full details and stack trace for a Sentry issue", {
    issueId: z.string().describe("Sentry issue ID"),
  }, async ({ issueId }) => {
    const event = await sentryApi(`/issues/${issueId}/events/latest/`);
    const trace = event.entries?.find((e: any) => e.type === "exception");
    return { content: [{ type: "text", text: JSON.stringify({
      message: event.message,
      tags: event.tags,
      stackTrace: trace?.data?.values?.map((v: any) => ({
        type: v.type, value: v.value,
        frames: v.stacktrace?.frames?.slice(-5).map((f: any) => ({
          filename: f.filename, function: f.function, lineNo: f.lineNo, context: f.context,
        })),
      })),
    }, null, 2) }] };
  });
}
```

Expected result: MCP tools that query Sentry for recent errors and detailed stack traces.
Add Datadog MCP tools for metrics and infrastructure monitoring
Build MCP tools that query Datadog for metrics, active alerts, and host information. The Datadog API provides time-series metrics, alert status, and infrastructure data. These tools help the AI correlate errors with infrastructure issues — like a Sentry spike matching a CPU spike on Datadog.
```typescript
// src/datadog-tools.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const DD_API_KEY = process.env.DD_API_KEY!;
const DD_APP_KEY = process.env.DD_APP_KEY!;

async function datadogApi(path: string, params?: Record<string, string>) {
  const url = new URL(`https://api.datadoghq.com/api/v1${path}`);
  if (params) Object.entries(params).forEach(([k, v]) => url.searchParams.set(k, v));
  const res = await fetch(url, {
    headers: { "DD-API-KEY": DD_API_KEY, "DD-APPLICATION-KEY": DD_APP_KEY },
  });
  return res.json();
}

export function registerDatadogTools(server: McpServer) {
  server.tool("get_active_alerts", "Get currently active Datadog monitors/alerts", {
    tag: z.string().optional().describe("Filter by tag, e.g. service:api"),
  }, async ({ tag }) => {
    const monitors = await datadogApi("/monitor", {
      ...(tag ? { monitor_tags: tag } : {}),
    });
    const active = monitors
      .filter((m: any) => m.overall_state === "Alert" || m.overall_state === "Warn")
      .map((m: any) => ({ id: m.id, name: m.name, state: m.overall_state, message: m.message }));
    return { content: [{ type: "text", text: JSON.stringify(active, null, 2) }] };
  });

  server.tool("query_metrics", "Query Datadog metrics for a time range", {
    query: z.string().describe("Datadog metric query, e.g. avg:system.cpu.user{service:api}"),
    hours: z.number().default(1).describe("Look back period in hours"),
  }, async ({ query, hours }) => {
    const now = Math.floor(Date.now() / 1000);
    const from = now - hours * 3600;
    const result = await datadogApi("/query", {
      query, from: String(from), to: String(now),
    });
    const series = result.series?.map((s: any) => ({
      metric: s.metric, scope: s.scope,
      points: s.pointlist?.slice(-10).map((p: any) => ({
        time: new Date(p[0]).toISOString(), value: p[1]?.toFixed(2),
      })),
    }));
    return { content: [{ type: "text", text: JSON.stringify(series, null, 2) }] };
  });
}
```

Expected result: MCP tools that query Datadog for active alerts and time-series metrics.
Combine monitoring with code context for root cause analysis
The real power comes from combining monitoring data with code access. When the AI finds an error in Sentry, it can read the actual source file via a Filesystem MCP server, understand the code around the failing line, and suggest a fix. Configure all three servers (Sentry, Datadog, Filesystem) together for comprehensive incident investigation.
```jsonc
// Complete MCP config with all monitoring + code servers:
{
  "mcpServers": {
    "sentry": {
      "command": "node",
      "args": ["dist/monitoring-server.js"],
      "env": {
        "SENTRY_AUTH_TOKEN": "sntrys_...",
        "SENTRY_ORG": "your-org"
      }
    },
    "datadog": {
      "command": "node",
      "args": ["dist/datadog-server.js"],
      "env": {
        "DD_API_KEY": "your-dd-api-key",
        "DD_APP_KEY": "your-dd-app-key"
      }
    },
    "codebase": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/repo"]
    }
  }
}

// Example incident investigation prompt:
// "We got paged for high error rates. Check Sentry for recent errors in the
// api-gateway project, look at the stack traces, read the relevant source
// files, and check Datadog for any infrastructure anomalies. Give me a
// root cause analysis and suggested fix."
```

Expected result: AI investigates incidents by querying errors, reading code, checking metrics, and suggesting fixes.
Build automated alert triage workflows
Create a tool that automates the initial triage of monitoring alerts. When called, it queries recent errors from Sentry, checks related metrics in Datadog, categorizes the severity, and returns a structured incident summary. This can be triggered by a webhook when an alert fires, providing instant AI triage before a human even looks at the alert.
```typescript
server.tool(
  "triage_incident",
  "Automated incident triage: checks errors and metrics, returns severity assessment",
  {
    project: z.string().describe("Sentry project slug"),
    service: z.string().describe("Service tag for Datadog metrics"),
  },
  async ({ project, service }) => {
    // Fetch recent errors
    const issues = await sentryApi(
      `/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=date&limit=5`
    );

    // Fetch active alerts
    const monitors = await datadogApi("/monitor", { monitor_tags: `service:${service}` });
    const activeAlerts = monitors.filter((m: any) =>
      m.overall_state === "Alert" || m.overall_state === "Warn"
    );

    // Assess severity (Sentry returns count as a string, so coerce before summing)
    const errorCount = issues.reduce((sum: number, i: any) => sum + Number(i.count || 0), 0);
    const severity = activeAlerts.length > 2 || errorCount > 100 ? "CRITICAL"
      : activeAlerts.length > 0 || errorCount > 10 ? "HIGH" : "LOW";

    const triage = {
      severity,
      timestamp: new Date().toISOString(),
      errorSummary: {
        totalErrors: errorCount,
        topIssues: issues.slice(0, 3).map((i: any) => ({ title: i.title, count: i.count })),
      },
      activeAlerts: activeAlerts.map((m: any) => ({ name: m.name, state: m.overall_state })),
      recommendation: severity === "CRITICAL"
        ? "Immediate investigation required. Multiple alerts active."
        : severity === "HIGH"
        ? "Investigate soon. Error rates elevated."
        : "Monitor. Low error rates, no active alerts.",
    };

    return { content: [{ type: "text", text: JSON.stringify(triage, null, 2) }] };
  }
);
```

Expected result: An automated triage tool that assesses incident severity from monitoring data.
Complete working example
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const SENTRY_TOKEN = process.env.SENTRY_AUTH_TOKEN!;
const SENTRY_ORG = process.env.SENTRY_ORG!;

async function sentry(path: string) {
  const r = await fetch(`https://sentry.io/api/0${path}`, {
    headers: { Authorization: `Bearer ${SENTRY_TOKEN}` },
  });
  return r.json();
}

const server = new McpServer({ name: "monitoring", version: "1.0.0" });

server.tool("recent_errors", "Get recent unresolved errors from Sentry", {
  project: z.string(), hours: z.number().default(24), limit: z.number().default(10),
}, async ({ project, hours, limit }) => {
  const issues = await sentry(`/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=freq&limit=${limit}&statsPeriod=${hours}h`);
  const data = issues.map((i: any) => ({
    id: i.id, title: i.title, count: i.count, level: i.level, lastSeen: i.lastSeen,
  }));
  return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
});

server.tool("error_details", "Get stack trace for a Sentry issue", {
  issueId: z.string(),
}, async ({ issueId }) => {
  const event = await sentry(`/issues/${issueId}/events/latest/`);
  const exc = event.entries?.find((e: any) => e.type === "exception");
  const frames = exc?.data?.values?.[0]?.stacktrace?.frames?.slice(-5) || [];
  return { content: [{ type: "text", text: JSON.stringify({
    message: event.message,
    frames: frames.map((f: any) => ({ file: f.filename, fn: f.function, line: f.lineNo })),
  }, null, 2) }] };
});

server.tool("triage", "Assess incident severity", {
  project: z.string(),
}, async ({ project }) => {
  const issues = await sentry(`/projects/${SENTRY_ORG}/${project}/issues/?query=is:unresolved&sort=date&limit=10`);
  // Sentry returns count as a string, so coerce before summing
  const total = issues.reduce((s: number, i: any) => s + Number(i.count || 0), 0);
  const severity = total > 100 ? "CRITICAL" : total > 10 ? "HIGH" : "LOW";
  return { content: [{ type: "text", text: JSON.stringify({
    severity, errorCount: total,
    topIssues: issues.slice(0, 3).map((i: any) => `${i.title} (${i.count}x)`),
  }, null, 2) }] };
});

async function main() {
  await server.connect(new StdioServerTransport());
  console.error("Monitoring MCP server running");
}
main().catch(e => { console.error(e); process.exit(1); });
```

Common mistakes when using MCP for monitoring and alerting
Mistake: Exposing monitoring API keys that have write access (resolving issues, silencing alerts).
How to avoid: Create read-only API tokens for monitoring services. The AI should investigate, not modify alert states.
Mistake: Returning entire stack traces with hundreds of frames, overwhelming the AI's context window.
How to avoid: Limit stack trace output to the five most recent frames per exception; these contain the relevant application code.
Mistake: Omitting timestamps from monitoring data, making it hard for the AI to correlate events.
How to avoid: Always include ISO timestamps in error and metric data so the AI can identify time-based patterns.
Best practices
- Use read-only API tokens for all monitoring service integrations
- Limit stack traces to the 5 most recent frames to keep responses concise
- Include timestamps in all monitoring data for temporal correlation
- Combine error monitoring (Sentry) with infrastructure monitoring (Datadog) for full context
- Add a triage tool that automatically assesses incident severity
- Pair monitoring tools with Filesystem tools so the AI can read the actual failing code
- Log all monitoring API calls to stderr for debugging and audit trails
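The last practice, logging every monitoring API call to stderr, can be sketched as a small wrapper around the fetch helpers used throughout this tutorial. The `withLogging` helper below is illustrative, not part of the tutorial's code; stderr is used because stdout carries MCP protocol messages:

```typescript
// Hypothetical helper: wraps any async API function so each call and its
// duration (or failure) is logged to stderr for debugging and audit trails.
type ApiFn = (path: string) => Promise<unknown>;

export function withLogging(name: string, fn: ApiFn): ApiFn {
  return async (path: string) => {
    const start = Date.now();
    try {
      const result = await fn(path);
      console.error(`[${name}] ${path} ok in ${Date.now() - start}ms`);
      return result;
    } catch (err) {
      console.error(`[${name}] ${path} FAILED: ${err}`);
      throw err;
    }
  };
}

// Usage sketch: const loggedSentry = withLogging("sentry", sentryApi);
```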
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
Build an MCP server for incident monitoring. Include tools to query Sentry for recent errors and stack traces, query Datadog for active alerts and metrics, and an automated triage tool that assesses severity. Use TypeScript.
Create monitoring MCP tools that wrap Sentry and Datadog APIs. Include recent_errors, error_details, get_active_alerts, query_metrics, and triage_incident tools. Return structured data optimized for AI interpretation.
Frequently asked questions
Can the AI resolve Sentry issues or silence Datadog alerts?
Only if you give the API token write permissions. For safety, start with read-only tokens so the AI can investigate but not modify monitoring state.
How do I handle rate limits from Sentry and Datadog APIs?
Add rate limiting to your MCP tools (see the rate limiting tutorial). Sentry allows 100 requests/minute, Datadog allows 300/minute for metrics queries.
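One minimal client-side approach is a sliding-window limiter that each MCP tool checks before calling the monitoring API. The class name and limits below are illustrative; tune the window to the quotas of the service you are wrapping:

```typescript
// Sliding-window rate limiter: allows at most maxRequests calls per windowMs.
export class RateLimiter {
  private timestamps: number[] = [];
  constructor(private maxRequests: number, private windowMs: number) {}

  // Returns true if the call may proceed; false if the caller should back off.
  tryAcquire(now = Date.now()): boolean {
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(now);
    return true;
  }
}

// Usage sketch inside a tool handler:
// const sentryLimiter = new RateLimiter(100, 60_000);
// if (!sentryLimiter.tryAcquire()) {
//   return { content: [{ type: "text", text: "Rate limit reached, retry shortly." }] };
// }
```

Returning a readable "rate limit reached" message (rather than throwing) lets the AI decide to wait and retry instead of treating the tool as broken.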
Can this work with PagerDuty or OpsGenie?
Yes. Build custom MCP tools that wrap the PagerDuty or OpsGenie REST API. The pattern is the same — query incidents, get details, and return structured data.
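As a sketch of that pattern, the helper below builds a request for PagerDuty's REST API v2 incidents endpoint (the `buildIncidentsRequest` name is illustrative; OpsGenie uses a different base URL and a `GenieKey` auth header, but the shape is the same):

```typescript
// Hypothetical sketch assuming PagerDuty REST API v2: GET /incidents filtered
// by status, authenticated with a "Token token=..." header.
export function buildIncidentsRequest(
  token: string,
  statuses: string[] = ["triggered", "acknowledged"]
) {
  const url = new URL("https://api.pagerduty.com/incidents");
  statuses.forEach(s => url.searchParams.append("statuses[]", s));
  return {
    url: url.toString(),
    headers: {
      Authorization: `Token token=${token}`,
      "Content-Type": "application/json",
    },
  };
}

// In an MCP tool handler:
// const { url, headers } = buildIncidentsRequest(process.env.PD_TOKEN!);
// const incidents = await fetch(url, { headers }).then(r => r.json());
```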
Can RapidDev set up monitoring MCP integrations?
Yes. RapidDev builds turnkey monitoring MCP setups that combine Sentry, Datadog, PagerDuty, and code access for AI-powered incident response.
How do I trigger AI triage automatically when an alert fires?
Use a webhook from your monitoring service to trigger a script that connects to the MCP server and calls the triage tool. The script can post the triage results to Slack or PagerDuty.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation