Use the Filesystem MCP server, optionally alongside a custom search server, to give AI assistants document search capabilities. The AI lists directory contents, reads files, and searches by filename pattern; a custom server adds full-text content search. Point the Filesystem MCP server at your documents folder, ask questions in natural language, and the AI finds and summarizes relevant documents so you don't have to dig through files manually.
Finding and Searching Documents with MCP
The simplest and most immediately useful MCP setup is connecting an AI assistant to your documents folder. Using the Filesystem MCP server, the AI can list files, read their contents, and search by name patterns. This tutorial shows how to set it up in minutes, then how to build a more advanced custom server with full-text content search for larger document collections.
Prerequisites
- Claude Desktop, Cursor, or Windsurf with MCP support
- A folder of documents (Markdown, text, PDF, code files) to search
- Node.js 18+ (required for npx to launch the Filesystem server, and for custom server development)
Step-by-step guide
Configure the Filesystem MCP server for your documents
The Filesystem MCP server is one of the reference servers published in the official Model Context Protocol servers repository. It provides tools to read files, list directories, search by filename, and get file metadata. Point it at your documents directory to give the AI access. You can allow multiple directories by passing additional path arguments.
Claude Desktop (macOS): ~/Library/Application Support/Claude/claude_desktop_config.json

```json
{
  "mcpServers": {
    "documents": {
      "command": "npx",
      "args": [
        "-y", "@modelcontextprotocol/server-filesystem",
        "/Users/you/Documents",
        "/Users/you/Projects/docs"
      ]
    }
  }
}
```

For Cursor, put the same configuration in .cursor/mcp.json. For VS Code, use the "servers" key instead of "mcpServers".

Expected result: The Filesystem MCP server provides read_file, list_directory, search_files, and get_file_info tools.
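The directories passed as arguments act as an allow-list: the server is meant to refuse paths outside them. The following is a minimal sketch of how such a check can work. It illustrates the idea and is not the Filesystem server's actual source; `isPathAllowed` is a hypothetical helper name.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns true when `target` resolves to a location
// inside (or equal to) one of `allowedDirs`.
// Assumption: symlinks are not resolved here; a real server would also need
// fs.realpath to stop a symlink from pointing outside the allowed folders.
function isPathAllowed(target: string, allowedDirs: string[]): boolean {
  const resolved = path.resolve(target);
  return allowedDirs.some((dir) => {
    const rel = path.relative(path.resolve(dir), resolved);
    // A relative path that climbs upward ("..") or is absolute (another
    // drive on Windows) means the target is outside this directory.
    const escapes =
      rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}
```

Comparing with `path.relative` instead of a plain string prefix avoids treating a sibling folder such as /Users/you/Documents-private as if it were inside /Users/you/Documents.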
Search documents with natural language queries
Once connected, ask the AI questions about your documents. It will use the MCP tools to list directories, search for relevant files, and read their contents. Start with broad questions and narrow down. The AI uses search_files for filename patterns and read_file to examine promising results.
Example questions to ask:

File discovery:
- "What documents are in my Documents folder?"
- "Find all PDF files in my Projects directory"
- "List all Markdown files related to onboarding"

Content search:
- "Find the document that talks about our pricing strategy"
- "Which files mention the Q1 2026 budget?"
- "Search for any documents about the API migration plan"

Summarization:
- "Read the meeting-notes-march.md file and summarize the key decisions"
- "Compare the contents of proposal-v1.md and proposal-v2.md"
- "What are the action items mentioned across all files in the meetings/ folder?"

Expected result: The AI finds relevant documents by name and content, reads them, and answers your questions.
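To make "search by filename pattern" concrete, here is a small sketch of glob-style matching against file names. It only illustrates the general technique; the Filesystem server's own matching rules may differ, and `globToRegExp` and `matchFileNames` are hypothetical names.

```typescript
// Convert a simple glob ("*" = any run of characters, "?" = one character)
// into a case-insensitive regular expression anchored at both ends.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters first, leaving "*" and "?" untouched.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`, "i");
}

// Return the paths whose final segment (the file name) matches the glob.
function matchFileNames(paths: string[], pattern: string): string[] {
  const re = globToRegExp(pattern);
  return paths.filter((p) => {
    const name = p.split(/[\\/]/).pop() ?? p;
    return re.test(name);
  });
}
```

A request such as "List all Markdown files related to onboarding" could translate into a pattern like `*onboarding*.md`, after which the AI reads the matching files to confirm they are relevant.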
Build a custom MCP server with full-text content search
The Filesystem server searches by filename, not content. For large document collections, build a custom MCP server that indexes file contents and provides full-text search. Index documents at startup, then expose a search_content tool that finds files containing specific terms or phrases. This is much faster than having the AI read every file.
```typescript
// src/doc-search-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import fs from "fs/promises";
import path from "path";

interface IndexEntry {
  path: string;
  content: string;
  modified: Date;
}

const index: IndexEntry[] = [];
const TEXT_EXTENSIONS = ['.md', '.txt', '.json', '.ts', '.js', '.py', '.yaml', '.yml'];

// Walk the directory tree and load every text file into the in-memory index.
async function indexDirectory(dir: string): Promise<void> {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Recurse manually so nested files are joined with their real parent directory
      await indexDirectory(fullPath);
      continue;
    }
    if (!entry.isFile()) continue;
    const ext = path.extname(entry.name).toLowerCase();
    if (!TEXT_EXTENSIONS.includes(ext)) continue;
    try {
      const content = await fs.readFile(fullPath, 'utf-8');
      const stat = await fs.stat(fullPath);
      index.push({ path: fullPath, content, modified: stat.mtime });
    } catch {
      // Skip files that cannot be read
    }
  }
}

const server = new McpServer({ name: "doc-search", version: "1.0.0" });

server.tool("search_content", "Search documents by content", {
  query: z.string().describe("Search term or phrase"),
  maxResults: z.number().default(10),
}, async ({ query, maxResults }) => {
  const lower = query.toLowerCase();
  const results = index
    .filter(e => e.content.toLowerCase().includes(lower))
    .slice(0, maxResults)
    .map(e => {
      // Return about 100 characters of context on each side of the first match
      const idx = e.content.toLowerCase().indexOf(lower);
      const start = Math.max(0, idx - 100);
      const end = Math.min(e.content.length, idx + query.length + 100);
      return { file: e.path, excerpt: '...' + e.content.slice(start, end) + '...' };
    });
  return { content: [{ type: "text", text: JSON.stringify(results, null, 2) }] };
});

async function main() {
  const dirs = process.argv.slice(2);
  for (const dir of dirs) {
    await indexDirectory(dir);
    // Log to stderr: stdout is reserved for the MCP protocol over stdio
    console.error(`Indexed ${index.length} files so far (after ${dir})`);
  }
  await server.connect(new StdioServerTransport());
  console.error("Document search server running");
}

main().catch(e => { console.error(e); process.exit(1); });
```

Expected result: A custom MCP server that indexes file contents and provides fast full-text search.
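A substring search like the one in search_content scans every indexed file on each query. For larger collections, one common alternative is an inverted index that maps each word to the documents containing it. The sketch below shows that technique with hypothetical names (`tokenize`, `buildInvertedIndex`, `searchAllTerms`). It is a different approach from substring search: it matches whole words only and does not support phrase queries.

```typescript
type DocId = number;

// Split text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Map each token to the set of document ids (array positions) that contain it.
function buildInvertedIndex(contents: string[]): Map<string, Set<DocId>> {
  const inverted = new Map<string, Set<DocId>>();
  contents.forEach((content, id) => {
    for (const token of tokenize(content)) {
      let ids = inverted.get(token);
      if (!ids) {
        ids = new Set<DocId>();
        inverted.set(token, ids);
      }
      ids.add(id);
    }
  });
  return inverted;
}

// Return ids of documents that contain every term in the query (AND semantics).
function searchAllTerms(inverted: Map<string, Set<DocId>>, query: string): DocId[] {
  const terms = tokenize(query);
  if (terms.length === 0) return [];
  let result: DocId[] | null = null;
  for (const term of terms) {
    const ids = inverted.get(term) ?? new Set<DocId>();
    const previous: DocId[] = result === null ? Array.from(ids) : result;
    result = previous.filter((id) => ids.has(id));
  }
  return (result ?? []).sort((a, b) => a - b);
}
```

Lookups then cost roughly one map access per query term instead of a pass over every file, at the price of extra memory for the index and the loss of partial-word matches.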
Combine with AI for intelligent document summarization
The real power of MCP document search comes from combining search with AI summarization. The AI searches for relevant files, reads their contents, and then synthesizes answers from multiple documents. This creates a lightweight knowledge management system where you can ask questions across your entire document collection. For organizations with large document repositories, RapidDev builds custom MCP solutions that combine full-text search with vector embeddings for semantic search.
Example workflow the AI executes:
1. User asks: "What were the key decisions from last month's meetings?"
2. AI calls search_content with a query such as "meeting" or "decisions"
3. AI reads the top matching files with read_file
4. AI synthesizes a summary across all meeting notes
5. AI returns a structured answer with source citations

To enable this, configure both servers. If the client does not start in your project directory, replace dist/doc-search-server.js with the absolute path to the compiled file.

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/you/Documents"]
    },
    "doc-search": {
      "command": "node",
      "args": ["dist/doc-search-server.js", "/Users/you/Documents"]
    }
  }
}
```

Expected result: AI searches across documents and synthesizes answers from multiple sources with citations.
Complete working example
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import fs from "fs/promises";
import path from "path";

interface Doc { path: string; content: string; size: number; modified: string; }
const docs: Doc[] = [];

const EXTS = new Set(['.md', '.txt', '.json', '.ts', '.js', '.py', '.yaml', '.yml', '.csv']);
const MAX_BYTES = 5_000_000; // skip files larger than about 5 MB

// Recursively index text files under dir, skipping oversized or unreadable files.
async function indexDir(dir: string): Promise<void> {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  for (const e of entries) {
    const p = path.join(dir, e.name);
    if (e.isDirectory()) {
      await indexDir(p); // nested files keep their real parent directory
      continue;
    }
    if (!e.isFile() || !EXTS.has(path.extname(e.name).toLowerCase())) continue;
    try {
      // Check the size before reading so large files never enter memory
      const stat = await fs.stat(p);
      if (stat.size > MAX_BYTES) {
        console.error(`Skipped (too large): ${p}`);
        continue;
      }
      const content = await fs.readFile(p, 'utf-8');
      docs.push({ path: p, content, size: stat.size, modified: stat.mtime.toISOString() });
    } catch {
      // Skip files that cannot be read
    }
  }
}

const server = new McpServer({ name: "doc-search", version: "1.0.0" });

server.tool("search_content", "Full-text search across all indexed documents", {
  query: z.string(), maxResults: z.number().default(10),
}, async ({ query, maxResults }) => {
  const q = query.toLowerCase();
  const hits = docs.filter(d => d.content.toLowerCase().includes(q)).slice(0, maxResults);
  const results = hits.map(d => {
    const i = d.content.toLowerCase().indexOf(q);
    return { file: d.path, excerpt: d.content.slice(Math.max(0, i - 100), i + query.length + 100) };
  });
  return { content: [{ type: "text", text: results.length ? JSON.stringify(results, null, 2) : "No matches found." }] };
});

server.tool("list_indexed", "List all indexed documents with metadata", {}, async () => {
  const list = docs.map(d => ({ file: d.path, size: d.size, modified: d.modified }));
  return { content: [{ type: "text", text: JSON.stringify(list, null, 2) }] };
});

server.tool("read_document", "Read a document's full content", {
  filePath: z.string(),
}, async ({ filePath }) => {
  // Accept either the full indexed path or a trailing portion of it
  const doc = docs.find(d => d.path === filePath || d.path.endsWith(filePath));
  if (!doc) return { content: [{ type: "text", text: "Document not found in index" }], isError: true };
  return { content: [{ type: "text", text: doc.content }] };
});

async function main() {
  for (const dir of process.argv.slice(2)) await indexDir(dir);
  console.error(`Indexed ${docs.length} documents`);
  await server.connect(new StdioServerTransport());
  console.error("Document search MCP server ready");
}

main().catch(e => { console.error(e); process.exit(1); });
```

Common mistakes when using MCP for AI-powered document search
Why it's a problem: Granting the Filesystem server access to the entire home directory, exposing sensitive files
How to avoid: Only allow access to specific document directories. Never include .ssh, .env files, or credential directories.
Why it's a problem: Trying to index binary files (images, compiled code), causing encoding errors
How to avoid: Filter by file extension and only index text-based formats (md, txt, json, ts, js, py, yaml, csv).
Why it's a problem: Not handling large files that exceed memory limits during indexing
How to avoid: Set a file size limit (e.g., 5MB) and skip files that exceed it. Log skipped files to stderr.
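The two indexing safeguards above (extension filter and size limit) can be combined into a single check, together with a simple content test for binary data. This is a sketch under assumptions: `checkIndexable`, `looksBinary`, and the NUL-byte heuristic are illustrative choices, not part of any MCP library.

```typescript
const TEXT_EXTENSIONS = new Set([".md", ".txt", ".json", ".ts", ".js", ".py", ".yaml", ".yml", ".csv"]);
const MAX_BYTES = 5_000_000; // about 5 MB

type SkipReason = "extension" | "too-large" | "binary";

// Heuristic: UTF-8 text files almost never contain a NUL byte, binary files usually do.
function looksBinary(sample: Uint8Array): boolean {
  return sample.indexOf(0) !== -1;
}

// Return null when the file should be indexed, otherwise the reason to skip it.
// `sample` is assumed to be the first bytes of the file (for example the first 4 KB).
function checkIndexable(fileName: string, sizeBytes: number, sample: Uint8Array): SkipReason | null {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (!TEXT_EXTENSIONS.has(ext)) return "extension";
  if (sizeBytes > MAX_BYTES) return "too-large";
  if (looksBinary(sample)) return "binary";
  return null;
}
```

Returning the reason rather than a bare boolean makes it easy to log each skipped file to stderr with an explanation.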
Best practices
- Limit Filesystem server access to specific directories containing documents only
- Use full-text search for large collections instead of reading every file
- Filter indexable file types to text-based formats only
- Include file metadata (size, modified date) in search results for context
- Return excerpts with surrounding context so the AI can judge relevance
- Combine search-by-name (fast) with search-by-content (thorough) for best results
- Set file size limits during indexing to prevent memory issues
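Two of the practices above, including file metadata and returning excerpts with surrounding context, can be combined in the shape of a single search result. The sketch below assumes a `Doc` record like the one in the complete example; `SearchHit` and `toSearchHit` are hypothetical names.

```typescript
interface Doc { path: string; content: string; size: number; modified: string; }
interface SearchHit { file: string; size: number; modified: string; excerpt: string; }

// Build a result for the first case-insensitive match of `query` in `doc`,
// or return null when the document does not contain the query.
function toSearchHit(doc: Doc, query: string, context = 100): SearchHit | null {
  const i = doc.content.toLowerCase().indexOf(query.toLowerCase());
  if (i === -1) return null;
  const start = Math.max(0, i - context);
  const end = Math.min(doc.content.length, i + query.length + context);
  // Add an ellipsis only on the side where text was actually cut off
  const excerpt =
    (start > 0 ? "..." : "") +
    doc.content.slice(start, end) +
    (end < doc.content.length ? "..." : "");
  return { file: doc.path, size: doc.size, modified: doc.modified, excerpt };
}
```

With size and modification date next to each excerpt, the AI can prefer recent documents or avoid opening very large files without an extra tool call.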
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
Set up the Filesystem MCP server to give Claude Desktop access to my documents folder. Then show me how to build a custom MCP server that indexes file contents and provides full-text search with excerpts. Use TypeScript and the MCP SDK.
Build a document search MCP server that indexes Markdown, text, and code files at startup, provides search_content and list_indexed tools, and returns search results with excerpts and file paths.
Frequently asked questions
Can the AI read PDF files through MCP?
The basic Filesystem server reads text files only. For PDFs, build a custom server that uses a PDF parsing library like pdf-parse to extract text before indexing.
How many files can the document search server handle?
The in-memory index works well for roughly 10,000 files, depending on file sizes and available memory. Beyond that, use a dedicated search engine like Elasticsearch or Meilisearch as the backend.
Does the Filesystem server let the AI modify my documents?
The Filesystem server includes write tools. If you want read-only access, use a custom server that only exposes read and search tools.
Can I search across documents in different formats?
Yes, as long as you extract text from each format. Build a custom server with parsers for Markdown, plain text, JSON, YAML, CSV, and any other formats you use.
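One way to organize per-format extraction is a lookup table from file extension to an extractor function, falling back to the raw text. The sketch below uses only built-in JavaScript features; the names (`extractText`, `jsonStrings`, `extractors`) are hypothetical, the CSV handling is deliberately naive (it ignores quoted commas), and binary formats such as PDF would need a parsing library that is not shown here.

```typescript
type Extractor = (raw: string) => string;

// Collect object keys and string values from parsed JSON so they become searchable text.
function jsonStrings(value: unknown, out: string[] = []): string[] {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((v) => jsonStrings(v, out));
  } else if (value !== null && typeof value === "object") {
    for (const [key, v] of Object.entries(value)) {
      out.push(key);
      jsonStrings(v, out);
    }
  }
  return out;
}

const extractors: Record<string, Extractor> = {
  ".json": (raw) => {
    try {
      return jsonStrings(JSON.parse(raw)).join("\n");
    } catch {
      return raw; // invalid JSON: index the raw text instead
    }
  },
  // Naive CSV handling: replace separators with spaces, one record per line
  ".csv": (raw) => raw.split(/\r?\n/).map((line) => line.split(",").join(" ")).join("\n"),
};

// Formats without a dedicated extractor (Markdown, plain text, YAML) are indexed as-is.
function extractText(ext: string, raw: string): string {
  const extractor = extractors[ext.toLowerCase()];
  return extractor ? extractor(raw) : raw;
}
```

Adding a new format then means adding one entry to the table, without touching the indexing loop.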
Can RapidDev build a custom document search solution?
Yes. RapidDev builds document search MCP servers that combine full-text search, vector embeddings, and metadata filtering for enterprise document collections.