
How to use MCP for data analysis with AI

Use MCP to give AI assistants access to databases and filesystems for data analysis. The AI queries your database, processes results, generates chart descriptions, and writes reports — all through natural language. Combine a Database MCP server for SQL queries with a Filesystem server for reading CSV files and writing output, enabling AI-powered analysis without exporting data to external tools.

What you'll learn

  • How to combine Database and Filesystem MCP servers for data analysis
  • How to ask natural language questions about data and get SQL-backed answers
  • How to build analysis-specific MCP tools with aggregation helpers
  • How to generate CSV exports and summary reports through MCP
Intermediate · 9 min read · 20-30 min · MCP TypeScript SDK v1.x, PostgreSQL or SQLite, Claude Desktop / Cursor · March 2026 · RapidDev Engineering Team
AI-Powered Data Analysis with MCP Database and File Tools

Data analysis usually means exporting data to a spreadsheet, writing SQL queries, or learning a BI tool. With MCP, you simply ask questions and the AI does the analysis: it queries the database, processes the results, and presents findings. This tutorial shows how to set up database and filesystem MCP servers for analysis workflows, and how to build custom analysis tools that generate summaries, aggregations, and CSV exports.

Prerequisites

  • A PostgreSQL or SQLite database with data to analyze
  • Claude Desktop or Cursor with MCP support
  • Database connection string or file path
  • Node.js 18+ for custom server development (optional)

Step-by-step guide

1. Set up Database and Filesystem MCP servers together

Configure both a database server (for SQL queries) and a filesystem server (for reading data files and writing reports). The database server lets the AI query structured data, while the filesystem server lets it read CSV imports and write analysis outputs. Together, they give the AI a complete data analysis environment.

typescript
// Claude Desktop config:
{
  "mcpServers": {
    "database": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres",
        "postgresql://analyst:readonly@localhost:5432/analytics"]
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem",
        "/Users/you/analysis-workspace"]
    }
  }
}

// Example analysis questions:
// "What's our monthly revenue trend for the past 12 months?"
// "Compare conversion rates between mobile and desktop users"
// "Find the top 10 customers by lifetime value and their order patterns"
// "Read the Q4-data.csv file and compare it with our database records"

Expected result: Both servers connected, enabling SQL queries and file read/write for analysis.

2. Ask natural language analysis questions

With the MCP servers connected, ask data analysis questions in natural language. The AI translates your questions into SQL, queries the database, processes results, and presents findings. It can handle multi-step analysis: first exploring the schema, then running queries, then synthesizing insights across multiple result sets.

typescript
// Analysis workflow examples:

// Revenue analysis:
// "What's our total revenue by month for 2025? Show the month-over-month
//  growth rate and identify any months with significant drops."

// The AI generates:
// SELECT DATE_TRUNC('month', created_at) AS month,
//        SUM(total) AS revenue
// FROM orders WHERE created_at >= '2025-01-01'
// GROUP BY 1 ORDER BY 1;

// User cohort analysis:
// "Break down our users by signup month and show their 30/60/90 day
//  retention rates. Which cohort has the best retention?"

// Funnel analysis:
// "How many users go from signup to first purchase, and what's the
//  average time between signup and first order?"

// File + database:
// "Read the marketing-spend.csv file, then compare the monthly spend
//  with our monthly revenue to calculate ROI per channel."

Expected result: The AI generates SQL queries, executes them, and presents analysis with insights.
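As a sanity check on the growth figures the AI reports from SQL, the same month-over-month calculation can be reproduced in plain TypeScript. This is a minimal sketch — `withGrowth` and the `PeriodRow` shape are illustrative helpers, not part of the MCP SDK:

```typescript
// Mirror of the SQL growth calculation: (rev - prev) / prev * 100,
// with null for the first period (no previous value to compare against).
interface PeriodRow {
  month: string;
  revenue: number;
}

function withGrowth(rows: PeriodRow[]): (PeriodRow & { growthPct: number | null })[] {
  return rows.map((row, i) => {
    const prev = i > 0 ? rows[i - 1].revenue : null;
    const growthPct =
      prev === null || prev === 0
        ? null // matches NULLIF(..., 0) guarding division by zero in the SQL
        : Math.round(((row.revenue - prev) / prev) * 1000) / 10; // one decimal place
    return { ...row, growthPct };
  });
}

const trend = withGrowth([
  { month: "2025-01", revenue: 100_000 },
  { month: "2025-02", revenue: 110_000 },
  { month: "2025-03", revenue: 99_000 },
]);
console.log(trend.map(t => t.growthPct)); // [null, 10, -10]
```

Running the SQL and a helper like this against the same rows is a quick way to confirm the AI's arithmetic before trusting a report.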

3. Build custom analysis MCP tools for common patterns

For analysis patterns you use frequently, build custom MCP tools that encapsulate the SQL and processing logic. Tools like revenue_by_period, user_retention, and funnel_analysis take high-level parameters and return pre-formatted results. This is faster and more reliable than having the AI generate SQL from scratch each time.

typescript
// src/analysis-tools.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";
import pg from "pg";

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

export function registerAnalysisTools(server: McpServer) {
  server.tool(
    "revenue_by_period",
    "Analyze revenue grouped by time period with growth rates",
    {
      period: z.enum(["day", "week", "month", "quarter"]).default("month"),
      startDate: z.string().describe("Start date in YYYY-MM-DD format"),
      endDate: z.string().optional().describe("End date, defaults to today"),
    },
    async ({ period, startDate, endDate }) => {
      const end = endDate || new Date().toISOString().split("T")[0];
      const result = await pool.query(`
        WITH periods AS (
          SELECT DATE_TRUNC($1, created_at) AS period,
                 SUM(total) AS revenue,
                 COUNT(*) AS orders
          FROM orders
          WHERE created_at BETWEEN $2 AND $3
          GROUP BY 1 ORDER BY 1
        )
        SELECT period, revenue, orders,
               ROUND((revenue - LAG(revenue) OVER (ORDER BY period)) /
                     NULLIF(LAG(revenue) OVER (ORDER BY period), 0) * 100, 1) AS growth_pct
        FROM periods
      `, [period, startDate, end]);
      return { content: [{ type: "text", text: JSON.stringify(result.rows, null, 2) }] };
    }
  );

  server.tool(
    "export_csv",
    "Run a SQL query and save results as a CSV file",
    {
      sql: z.string().describe("SELECT query to export"),
      filename: z.string().describe("Output filename like report.csv"),
    },
    async ({ sql, filename }) => {
      if (/\b(DELETE|DROP|UPDATE|INSERT)\b/i.test(sql)) {
        return { content: [{ type: "text", text: "Error: Only SELECT queries allowed" }], isError: true };
      }
      const result = await pool.query(sql);
      const headers = result.fields.map(f => f.name).join(",");
      const rows = result.rows.map(r =>
        result.fields.map(f => JSON.stringify(r[f.name] ?? "")).join(",")
      );
      const csv = [headers, ...rows].join("\n");
      const fs = await import("fs/promises");
      // Strip path separators so the AI cannot write outside /tmp/analysis
      const safeName = filename.replace(/[^\w.-]/g, "_");
      const outPath = `/tmp/analysis/${safeName}`;
      await fs.mkdir("/tmp/analysis", { recursive: true });
      await fs.writeFile(outPath, csv, "utf-8");
      return { content: [{ type: "text", text: `Exported ${result.rowCount} rows to ${outPath}` }] };
    }
  );
}

Expected result: Custom analysis tools that handle common patterns like revenue analysis and CSV exports.
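One caveat with the export above: `JSON.stringify` escapes embedded quotes with backslashes, which most CSV consumers (Excel, pandas) do not understand. If exported files need to open cleanly elsewhere, an RFC 4180-style escaper is safer. A sketch — `csvField` and `toCsv` are illustrative helpers, not part of the server above:

```typescript
// RFC 4180: quote a field when it contains a comma, quote, or newline,
// and double any embedded quotes. null/undefined become empty fields.
function csvField(value: unknown): string {
  if (value === null || value === undefined) return "";
  const s = String(value);
  return /[",\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function toCsv(fields: string[], rows: Record<string, unknown>[]): string {
  const header = fields.map(csvField).join(",");
  const body = rows.map(r => fields.map(f => csvField(r[f])).join(","));
  return [header, ...body].join("\n");
}

console.log(toCsv(["name", "note"], [{ name: 'Acme "West"', note: "a,b" }]));
// name,note
// "Acme ""West""","a,b"
```

Dropping this in place of the `JSON.stringify`-per-value line keeps the tool's interface unchanged while producing standard CSV quoting.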

4. Generate automated analysis reports

Build a report generation tool that runs multiple queries, formats the results into a readable report, and saves it to a file. The AI can then read and summarize the report. This is useful for daily or weekly automated analysis. For organizations that need advanced analytics pipelines, RapidDev builds custom MCP data analysis platforms that integrate with BI tools and data warehouses.

typescript
server.tool(
  "generate_weekly_report",
  "Generate a weekly business metrics report",
  {
    weekStart: z.string().describe("Monday date in YYYY-MM-DD format"),
  },
  async ({ weekStart }) => {
    const weekEnd = new Date(new Date(weekStart).getTime() + 7 * 86400000)
      .toISOString().split("T")[0];

    const revenue = await pool.query(
      `SELECT SUM(total) AS total, COUNT(*) AS orders FROM orders
       WHERE created_at BETWEEN $1 AND $2`, [weekStart, weekEnd]
    );
    const newUsers = await pool.query(
      `SELECT COUNT(*) AS count FROM users
       WHERE created_at BETWEEN $1 AND $2`, [weekStart, weekEnd]
    );
    const topProducts = await pool.query(
      `SELECT p.name, SUM(oi.quantity) AS units, SUM(oi.total) AS revenue
       FROM order_items oi JOIN products p ON oi.product_id = p.id
       JOIN orders o ON oi.order_id = o.id
       WHERE o.created_at BETWEEN $1 AND $2
       GROUP BY p.id ORDER BY revenue DESC LIMIT 5`, [weekStart, weekEnd]
    );

    const report = {
      period: `${weekStart} to ${weekEnd}`,
      revenue: revenue.rows[0],
      newUsers: newUsers.rows[0].count,
      topProducts: topProducts.rows,
    };

    return { content: [{ type: "text", text: JSON.stringify(report, null, 2) }] };
  }
);

Expected result: A report generation tool that runs multiple queries and returns a structured weekly summary.
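The tool above returns raw JSON; to save a human-readable version into the filesystem server's workspace, a small formatter helps. A sketch assuming the report shape from `generate_weekly_report` — `reportToMarkdown` and the `WeeklyReport` interface are illustrative, and the numeric fields are typed `string | number` because node-postgres returns `SUM` and `COUNT` as strings:

```typescript
// Format the weekly report object as Markdown so the AI (or a human)
// can read it back through the filesystem MCP server.
interface WeeklyReport {
  period: string;
  revenue: { total: string | number; orders: string | number };
  newUsers: string | number;
  topProducts: { name: string; units: string | number; revenue: string | number }[];
}

function reportToMarkdown(r: WeeklyReport): string {
  const products = r.topProducts
    .map(p => `| ${p.name} | ${p.units} | ${p.revenue} |`)
    .join("\n");
  return [
    `# Weekly report: ${r.period}`,
    ``,
    `- Revenue: ${r.revenue.total} across ${r.revenue.orders} orders`,
    `- New users: ${r.newUsers}`,
    ``,
    `| Product | Units | Revenue |`,
    `| --- | --- | --- |`,
    products,
  ].join("\n");
}
```

Inside the tool, `await fs.writeFile("/Users/you/analysis-workspace/weekly-report.md", reportToMarkdown(report))` (path hypothetical, matching the workspace from step 1) would let the AI read and summarize the report on its next turn.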

Complete working example

src/analysis-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import pg from "pg";

// query_timeout makes node-postgres abandon any query after 15 seconds
const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL,
  max: 3,
  query_timeout: 15_000,
});
const BLOCKED = /\b(DELETE|DROP|TRUNCATE|UPDATE|INSERT|ALTER|CREATE)\b/i;

const server = new McpServer({ name: "data-analysis", version: "1.0.0" });

server.tool("schema", "Get database schema", {}, async () => {
  const r = await pool.query(
    `SELECT table_name, column_name, data_type FROM information_schema.columns
     WHERE table_schema = 'public' ORDER BY table_name, ordinal_position`
  );
  return { content: [{ type: "text", text: JSON.stringify(r.rows, null, 2) }] };
});

server.tool("query", "Run a read-only SQL query", {
  sql: z.string(), limit: z.number().default(100),
}, async ({ sql, limit }) => {
  if (BLOCKED.test(sql)) return { content: [{ type: "text", text: "Only SELECT allowed" }], isError: true };
  // Crude heuristic: append a LIMIT unless the query already mentions one
  const q = sql.toLowerCase().includes("limit") ? sql : `${sql} LIMIT ${limit}`;
  const r = await pool.query(q);
  return { content: [{ type: "text", text: JSON.stringify({
    rows: r.rowCount, columns: r.fields.map(f => f.name), data: r.rows,
  }, null, 2) }] };
});

server.tool("revenue_trend", "Revenue by period with growth rates", {
  period: z.enum(["day", "week", "month"]).default("month"),
  months: z.number().default(12),
}, async ({ period, months }) => {
  const r = await pool.query(`
    WITH p AS (
      SELECT DATE_TRUNC($1, created_at) AS t, SUM(total) AS rev, COUNT(*) AS orders
      FROM orders WHERE created_at >= NOW() - ($2 || ' months')::interval
      GROUP BY 1 ORDER BY 1
    ) SELECT t, rev, orders,
        ROUND((rev - LAG(rev) OVER (ORDER BY t)) / NULLIF(LAG(rev) OVER (ORDER BY t), 0) * 100, 1) AS growth
    FROM p`, [period, months]);
  return { content: [{ type: "text", text: JSON.stringify(r.rows, null, 2) }] };
});

server.tool("export_csv", "Export query results to CSV", {
  sql: z.string(), filename: z.string(),
}, async ({ sql, filename }) => {
  if (BLOCKED.test(sql)) return { content: [{ type: "text", text: "Only SELECT allowed" }], isError: true };
  const r = await pool.query(sql);
  const csv = [r.fields.map(f => f.name).join(","),
    ...r.rows.map(row => r.fields.map(f => JSON.stringify(row[f.name] ?? "")).join(","))
  ].join("\n");
  const fs = await import("fs/promises");
  // Strip path separators so the AI cannot write outside /tmp
  const safeName = filename.replace(/[^\w.-]/g, "_");
  await fs.writeFile(`/tmp/${safeName}`, csv);
  return { content: [{ type: "text", text: `Exported ${r.rowCount} rows to /tmp/${safeName}` }] };
});

async function main() {
  await server.connect(new StdioServerTransport());
  console.error("Data analysis MCP server running");
}
main().catch(e => { console.error(e); process.exit(1); });
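To use this server from Claude Desktop, register it alongside the stock servers from step 1. A sketch of the config entry — the build path and connection string are placeholders for your own setup:

```json
{
  "mcpServers": {
    "data-analysis": {
      "command": "node",
      "args": ["/path/to/project/dist/analysis-server.js"],
      "env": {
        "DATABASE_URL": "postgresql://analyst:readonly@localhost:5432/analytics"
      }
    }
  }
}
```

Using a read-only `analyst` role in `DATABASE_URL` adds a database-level backstop behind the `BLOCKED` regex check.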

Common mistakes when using MCP for data analysis with AI

Running analysis queries without timeouts

Why it's a problem: Expensive scans can block the database and leave the AI (and other users) waiting indefinitely.

How to avoid: Set a query timeout (15-30 seconds) on all analysis queries. Alert the user if a query is too complex.

Returning unbounded result sets

Why it's a problem: Without LIMIT clauses, queries can return millions of rows that overwhelm the AI context.

How to avoid: Auto-append LIMIT to all queries. For analysis, 100-1000 rows is usually sufficient for the AI to identify patterns.

Giving the analysis server write access to production

Why it's a problem: A single bad query generated by the AI can modify or destroy production data.

How to avoid: Use a read replica or a read-only user. Analysis should never modify production data.

Best practices

  • Use read-only database credentials for all analysis MCP servers
  • Auto-append LIMIT clauses to prevent returning too many rows
  • Set query timeouts to prevent expensive scans from blocking the database
  • Build pre-built analysis tools for common patterns (revenue, retention, funnels)
  • Return structured JSON so the AI can interpret and present data flexibly
  • Include a schema discovery tool so the AI understands table structures
  • Separate the analysis database from production with a read replica
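On auto-appending LIMIT: the `includes("limit")` check in the complete example can be fooled by a column named `rate_limit` or by a LIMIT buried in a subquery. A slightly more careful sketch — `ensureLimit` is an illustrative helper; a real SQL parser is the robust answer:

```typescript
// Append a LIMIT only when the statement does not already end with one.
// The regex covers the common cases: a trailing "LIMIT n" or
// "LIMIT n OFFSET m", with optional trailing semicolons.
function ensureLimit(sql: string, max = 100): string {
  const trimmed = sql.trim().replace(/;+\s*$/, ""); // drop trailing semicolons
  const hasLimit = /\blimit\s+\d+(\s+offset\s+\d+)?\s*$/i.test(trimmed);
  return hasLimit ? trimmed : `${trimmed} LIMIT ${max}`;
}

console.log(ensureLimit("SELECT * FROM orders"));          // SELECT * FROM orders LIMIT 100
console.log(ensureLimit("SELECT * FROM orders LIMIT 10")); // SELECT * FROM orders LIMIT 10
```

Note that `SELECT rate_limit FROM api_keys` now correctly gets a LIMIT appended, which the naive substring check would skip.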

Still stuck?

Copy one of these prompts to get a personalized, step-by-step explanation.

ChatGPT Prompt

Build an MCP server for data analysis in TypeScript. Include tools for schema discovery, read-only SQL queries with safety checks, a revenue trend analyzer with growth rates, and a CSV export tool. Use PostgreSQL.

MCP Prompt

Create a data analysis MCP server with query, schema, revenue_trend, and export_csv tools. Use PostgreSQL with read-only safety checks, automatic LIMIT clauses, and query timeouts. Return structured JSON results.

Frequently asked questions

Can the AI create visualizations from database queries?

The AI can describe charts and recommend visualization types. For actual charts, export data to CSV and open in a BI tool, or have the AI generate Python matplotlib code that you run separately.

How do I analyze data from multiple databases?

Connect multiple database MCP servers, each pointing to a different database. The AI can query each one and synthesize insights across data sources.

Is this suitable for large datasets (millions of rows)?

Yes, if queries use proper indexes and aggregation. The MCP server runs the SQL on the database, not in memory. Use LIMIT for raw queries and GROUP BY for aggregations.

Can RapidDev build custom data analysis MCP servers?

Yes. RapidDev builds analysis MCP servers with custom tools for specific business metrics, pre-built dashboards, and integration with data warehouses like Snowflake and BigQuery.

Can I use this with Snowflake or BigQuery instead of PostgreSQL?

Yes. Replace the pg client with the Snowflake SDK or BigQuery client library. The MCP tool interface remains the same — only the database driver changes.
