How to Build a Reporting Tool with Replit

What you'll build

  • Data source registration that introspects PostgreSQL information_schema to return available columns and types
  • Report definition storage with user-selected columns, filters, group-by dimensions, aggregates, and sort options
  • Safe SQL query builder that translates report configs into parameterized Drizzle queries with no injection risk
  • CSV streaming export and PDF generation using pdfkit for downloaded report files
  • Report schedule system with SendGrid delivery for daily, weekly, and monthly email reports
  • Three-step React report wizard: pick columns, add filters, preview results and export

Intermediate · 16 min read · 1-2 hours · Replit Free · April 2026 · RapidDev Engineering Team

TL;DR

Build a custom report builder in Replit using Express and PostgreSQL in 1-2 hours. Users define reports by picking columns, adding filters, and choosing aggregates; the engine safely translates their selections into SQL using Drizzle's query builder. Exports go to CSV or PDF, and a Scheduled Deployment emails scheduled reports via SendGrid.

What you're building

A custom report builder solves a problem every growing business faces: your team wants to answer specific questions from your data, but they can't write SQL. A reporting tool gives non-technical users a way to define exactly the data slice they need — which columns to show, which filters to apply, and how to aggregate — without touching a database query.

Replit Agent builds the complete backend with four tables: data_sources (registered tables in your PostgreSQL), reports (saved report definitions as JSON), report_schedules (automated email delivery), and report_exports (generated file history). The execution engine is the critical piece: it reads a report's JSON config and builds a safe SQL query using Drizzle's parameterized query builder. The most important security rule here is to never string-interpolate user input into SQL.
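
For concreteness, here's what one saved report definition might look like in those JSONB columns (illustrative values for a hypothetical orders table; step 3 turns exactly this shape into SQL):

reports row (excerpt)
{
  "columns": [
    { "field": "status", "label": "Status", "aggregate": null },
    { "field": "total", "label": "Revenue", "aggregate": "sum" }
  ],
  "filters": [
    { "field": "created_at", "operator": "gte", "value": "2026-01-01" }
  ],
  "group_by": ["status"],
  "sort_by": { "field": "status", "direction": "asc" }
}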

The schema introspection feature makes the tool self-documenting: when a user picks a data source, the API queries PostgreSQL's information_schema.columns to return the actual column names and data types for that table. This means the column picker always shows accurate, up-to-date options without you maintaining a separate schema definition file. A Scheduled Deployment runs daily to check which reports have email delivery configured and are due to run — generating the report, attaching a CSV, and sending via SendGrid.

Final result

A custom report builder with schema introspection, SQL-safe query execution, CSV/PDF export, and scheduled email delivery — running on your Replit PostgreSQL database with no external BI tool costs.

Tech stack

  • Replit: IDE & Hosting
  • Express: Backend Framework
  • PostgreSQL: Database
  • Drizzle ORM: Database ORM
  • Replit Auth: Authentication
  • SendGrid: Scheduled Report Delivery

Prerequisites

  • A Replit account (Free tier is sufficient)
  • A SendGrid account (free tier) for scheduled report email delivery
  • SENDGRID_API_KEY added to Replit Secrets and Deployment Secrets
  • Some data in your PostgreSQL tables to report on — the tool works on your existing tables

Build steps

1

Scaffold the project with Replit Agent

Create a new Replit App and paste this prompt. Agent builds the complete reporting backend with all four tables and the React wizard frontend.

prompt.txt
// Build a custom reporting tool with Express and PostgreSQL using Drizzle ORM.
// Use Replit Auth for authentication.
//
// Tables:
// 1. data_sources: id serial primary key, user_id text not null, name text not null,
//    type text default 'postgres' (enum: postgres/csv/api),
//    connection_config jsonb (stores table name for postgres type),
//    created_at timestamp default now()
// 2. reports: id serial primary key, user_id text not null, name text not null,
//    data_source_id integer references data_sources not null,
//    columns jsonb not null (array of {field, label, aggregate: null/sum/count/avg/min/max}),
//    filters jsonb (array of {field, operator: eq/neq/gt/lt/gte/lte/like/in, value}),
//    group_by text[] (PostgreSQL text array of field names),
//    sort_by jsonb ({field, direction: asc/desc}),
//    created_at timestamp default now(), updated_at timestamp default now()
// 3. report_schedules: id serial primary key,
//    report_id integer references reports not null,
//    frequency text not null (enum: daily/weekly/monthly),
//    email_to text not null, last_run_at timestamp, next_run_at timestamp
// 4. report_exports: id serial primary key,
//    report_id integer references reports not null,
//    format text not null (enum: csv/pdf/json),
//    file_url text, created_at timestamp default now()
//
// Routes:
// GET/POST /api/data-sources
// GET /api/data-sources/:id/schema (introspect columns using information_schema)
// GET/POST /api/reports, GET /api/reports/:id
// POST /api/reports/:id/run (execute report, return JSON results)
// POST /api/reports/:id/export (generate CSV or PDF)
// POST /api/report-schedules
//
// Bind to 0.0.0.0:3000. Use Replit Auth.

Pro tip: After scaffolding, register your first data source via POST /api/data-sources with {name: 'Orders', type: 'postgres', connectionConfig: {table: 'orders'}} (matching the connectionConfig key the POST route reads from the request body). Then call GET /api/data-sources/1/schema to see all columns auto-detected from information_schema.
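
If you'd rather poke at the API from the browser devtools console than a REST client, a quick sketch of that flow (values illustrative, assumes you're logged in):

console.js
// Register a data source pointing at an existing table
const res = await fetch('/api/data-sources', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ name: 'Orders', type: 'postgres', connectionConfig: { table: 'orders' } }),
});
const source = await res.json();

// Introspect its columns
const schema = await (await fetch(`/api/data-sources/${source.id}/schema`)).json();
console.log(schema.columns);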

Expected result: Running Express app with four tables. GET /api/data-sources returns empty array. Replit Auth login gate on all routes.

2

Build the schema introspection route

This route queries PostgreSQL's information_schema to return the columns and data types for any registered table — powering the dynamic column picker in the report wizard.

server/routes/dataSources.js
const express = require('express');
const { db } = require('../db');
const { dataSources } = require('../schema');
const { eq, and, sql } = require('drizzle-orm');

const router = express.Router();

router.get('/api/data-sources/:id/schema', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });

  const [source] = await db.select().from(dataSources)
    .where(and(
      eq(dataSources.id, parseInt(req.params.id)),
      eq(dataSources.userId, req.user.id)
    )).limit(1);

  if (!source) return res.status(404).json({ error: 'Data source not found' });

  const tableName = source.connectionConfig?.table;
  if (!tableName) return res.status(400).json({ error: 'No table configured for this source' });

  // Validate table name is alphanumeric to prevent SQL injection
  if (!/^[a-zA-Z0-9_]+$/.test(tableName)) {
    return res.status(400).json({ error: 'Invalid table name' });
  }

  // The sql template tag binds tableName as a query parameter automatically
  const columns = await db.execute(sql`
    SELECT column_name, data_type, is_nullable, column_default
    FROM information_schema.columns
    WHERE table_schema = 'public'
      AND table_name = ${tableName}
    ORDER BY ordinal_position
  `);

  // Map PostgreSQL types to friendlier labels
  const typeMap = {
    'integer': 'number', 'bigint': 'number', 'numeric': 'number',
    'text': 'string', 'character varying': 'string', 'varchar': 'string',
    'timestamp with time zone': 'date', 'timestamp without time zone': 'date',
    'date': 'date', 'boolean': 'boolean', 'jsonb': 'json', 'json': 'json',
    'ARRAY': 'array',
  };

  const schema = columns.rows.map(col => ({
    field: col.column_name,
    type: typeMap[col.data_type] || col.data_type,
    nullable: col.is_nullable === 'YES',
    // Suggest aggregates based on type
    supportedAggregates: typeMap[col.data_type] === 'number'
      ? ['count', 'sum', 'avg', 'min', 'max']
      : ['count'],
  }));

  return res.json({ sourceId: source.id, tableName, columns: schema });
});

module.exports = router;

Pro tip: The table name validation regex (alphanumeric + underscore only) is essential. In this introspection query the table name is only compared as a value, so the sql tag parameterizes it safely; but the same stored name is later interpolated into the FROM clause of the report execution query (step 3), and PostgreSQL cannot parameterize identifiers like table or column names. There, the regex is the only defense.

Expected result: GET /api/data-sources/1/schema returns an array of columns with types. A table with 8 columns returns 8 schema entries, each with field name, type, and supported aggregates.
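
For a hypothetical orders table, that response might look like this (your columns and types will differ):

{
  "sourceId": 1,
  "tableName": "orders",
  "columns": [
    { "field": "id", "type": "number", "nullable": false, "supportedAggregates": ["count", "sum", "avg", "min", "max"] },
    { "field": "customer_email", "type": "string", "nullable": false, "supportedAggregates": ["count"] },
    { "field": "total", "type": "number", "nullable": true, "supportedAggregates": ["count", "sum", "avg", "min", "max"] },
    { "field": "created_at", "type": "date", "nullable": false, "supportedAggregates": ["count"] }
  ]
}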

3

Build the safe SQL query executor

The report execution engine reads the report's JSON config and builds a parameterized SQL query using Drizzle's query builder. Never string-interpolate user values — always use parameterized queries.

server/routes/reports.js
const express = require('express');
const { db } = require('../db');
const { reports, dataSources } = require('../schema');
const { eq, and, sql } = require('drizzle-orm');

const router = express.Router();

// Whitelists: only these operators and aggregates can reach the SQL string.
// Note: the 'in' operator from the schema needs array binding and is omitted here.
const OPERATORS = {
  eq: '=', neq: '!=', gt: '>', lt: '<', gte: '>=', lte: '<=', like: 'LIKE',
};
const AGGREGATES = ['count', 'sum', 'avg', 'min', 'max'];

// Strip everything except alphanumerics and underscores from identifiers
const sanitizeIdent = (name) => String(name).replace(/[^a-zA-Z0-9_]/g, '');

router.post('/api/reports/:id/run', express.json(), async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });

  const [report] = await db.select().from(reports)
    .where(and(eq(reports.id, parseInt(req.params.id)), eq(reports.userId, req.user.id)))
    .limit(1);
  if (!report) return res.status(404).json({ error: 'Report not found' });

  const [source] = await db.select().from(dataSources)
    .where(eq(dataSources.id, report.dataSourceId)).limit(1);
  const tableName = source?.connectionConfig?.table;

  if (!tableName || !/^[a-zA-Z0-9_]+$/.test(tableName)) {
    return res.status(400).json({ error: 'Invalid data source table' });
  }

  // Build SELECT clause: identifiers sanitized, aggregates whitelisted,
  // labels stripped of double quotes so they can't escape the quoted alias
  const selectParts = report.columns.map(col => {
    const safeField = sanitizeIdent(col.field);
    const label = (col.label || safeField).replace(/"/g, '');
    if (col.aggregate && AGGREGATES.includes(col.aggregate)) {
      return sql.raw(`${col.aggregate.toUpperCase()}(${safeField}) AS "${label}"`);
    }
    return sql.raw(`${safeField} AS "${label}"`);
  });

  let query = sql`SELECT ${sql.join(selectParts, sql`, `)} FROM ${sql.raw(tableName)}`;

  // Build WHERE clause: filter values are bound as parameters, never inlined
  const whereParts = (report.filters || []).map(filter => {
    const safeField = sanitizeIdent(filter.field);
    const op = OPERATORS[filter.operator] || '=';
    return sql`${sql.raw(`${safeField} ${op}`)} ${filter.value}`;
  });
  if (whereParts.length) {
    query = sql`${query} WHERE ${sql.join(whereParts, sql` AND `)}`;
  }

  // GROUP BY and ORDER BY use sanitized identifiers only
  const groupByFields = (report.groupBy || []).map(sanitizeIdent);
  if (groupByFields.length) {
    query = sql`${query} GROUP BY ${sql.raw(groupByFields.join(', '))}`;
  }

  if (report.sortBy?.field) {
    const dir = report.sortBy.direction === 'desc' ? 'DESC' : 'ASC';
    query = sql`${query} ORDER BY ${sql.raw(`${sanitizeIdent(report.sortBy.field)} ${dir}`)}`;
  }

  // Cap preview result size to avoid memory exhaustion
  query = sql`${query} LIMIT 1000`;

  const result = await db.execute(query);
  return res.json({ rows: result.rows, count: result.rows.length });
});

module.exports = router;

Pro tip: The LIMIT 1000 at the end prevents memory exhaustion on large tables. For exports, the CSV streaming route can remove this limit and stream rows in chunks of 500 to avoid loading the entire result into memory.

Expected result: POST /api/reports/1/run returns {rows: [{column1: val, column2: val}], count: 47} based on the report's saved configuration.
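
To make the translation concrete, here's what the executor generates for one illustrative config. The filter value travels as a bound parameter ($1), never inlined:

// Config: columns [{field: 'status'}, {field: 'total', label: 'Revenue', aggregate: 'sum'}],
//         filters [{field: 'created_at', operator: 'gte', value: '2026-01-01'}],
//         groupBy ['status'], sortBy {field: 'status', direction: 'asc'}
//
// Generated SQL ($1 is bound to '2026-01-01'):
//   SELECT status AS "status", SUM(total) AS "Revenue"
//   FROM orders
//   WHERE created_at >= $1
//   GROUP BY status
//   ORDER BY status ASC
//   LIMIT 1000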

4

Add CSV export and scheduled report delivery

The export route generates a CSV from the report results. A Scheduled Deployment checks for due report schedules and emails the CSV attachment via SendGrid.

prompt.txt
// Ask Agent to add CSV export and scheduled delivery with this prompt:
// Add to server/routes/reports.js:
//
// POST /api/reports/:id/export:
// 1. Require auth, load report config
// 2. Run the same SQL query as /run (extract to a shared executeReport(reportId) function)
// 3. If format='csv':
//    - Generate CSV string: first row = column labels, remaining rows = data values
//    - Use Array.join() to build CSV, escape values with quotes if they contain commas
//    - Set Content-Type: text/csv, Content-Disposition: attachment; filename=report.csv
//    - Send the CSV string directly (no file storage needed for on-demand exports)
// 4. If format='pdf':
//    - Use pdfkit npm package
//    - Create a PDF doc with the report name as title, run date, and data as a simple table
//    - Send as application/pdf response
// 5. Insert into report_exports table with format and created_at
//
// Create server/jobs/sendScheduledReports.js:
// 1. Query report_schedules where next_run_at <= now()
// 2. For each due schedule, call executeReport(schedule.report_id) to get rows
// 3. Generate CSV attachment
// 4. Send email via SendGrid (SENDGRID_API_KEY from process.env) with CSV attachment
// 5. Update last_run_at = now(), calculate next_run_at based on frequency
//    (daily: + 1 day, weekly: + 7 days, monthly: + 1 month)
// 6. Deploy as Scheduled Deployment running every hour

Pro tip: For the CSV export, handle commas and newlines in cell values by wrapping all values in double quotes and escaping internal double quotes as '""'. This prevents broken CSV files when report data contains punctuation.
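
A minimal escaper implementing that rule (pass every cell through it before joining with commas):

csvEscape.js
function csvEscape(value) {
  const str = value == null ? '' : String(value);
  // Quote the whole cell and double any internal quotes, so commas,
  // newlines, and quotes in the data can't break the CSV structure
  return `"${str.replace(/"/g, '""')}"`;
}

// Usage: rows.map(row => fields.map(f => csvEscape(row[f])).join(',')).join('\n')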

Expected result: POST /api/reports/1/export with {format:'csv'} triggers a CSV file download. The Scheduled Deployment sends the report CSV via email at the configured frequency.
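
To sanity-check the schedule math Agent generates, a next_run_at helper matching the frequency rules above could look like this (a sketch, not necessarily Agent's exact output):

// Compute the next delivery time from a frequency string
function nextRunAt(frequency, from = new Date()) {
  const next = new Date(from);
  if (frequency === 'daily') next.setDate(next.getDate() + 1);
  else if (frequency === 'weekly') next.setDate(next.getDate() + 7);
  else if (frequency === 'monthly') next.setMonth(next.getMonth() + 1);
  // Note: JavaScript rolls Jan 31 + 1 month into early March; clamp the
  // day of month first if exact month-end behavior matters to you
  return next;
}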

5

Build the three-step React report wizard

Ask Agent to create the React frontend with the wizard for building reports and a saved reports list.

prompt.txt
// Ask Agent to build the React frontend with this prompt:
// Build a React reporting tool frontend with three pages:
//
// 1. Reports List page (/):
//    - List of user's saved reports with name, data source, and created date
//    - Each row has: Run (executes and shows results), Export CSV, Export PDF, Schedule buttons
//    - 'Create Report' button navigates to the wizard
//
// 2. Report Builder Wizard (/reports/new and /reports/:id/edit):
//    Step 1 — Select Data Source and Columns:
//    - Dropdown of GET /api/data-sources sources
//    - On selection, fetch GET /api/data-sources/:id/schema for column list
//    - Checkbox list of columns, each with a label input and aggregate selector dropdown
//      (none/count/sum/avg/min/max — only show numeric aggregates for number-type columns)
//
//    Step 2 — Add Filters:
//    - 'Add Filter' button adds a row: field selector, operator selector, value input
//    - Operators: equals, not equals, greater than, less than, contains, is empty
//    - A 'Sort by' row with field selector and ASC/DESC toggle
//
//    Step 3 — Preview and Save:
//    - 'Preview' button calls POST /api/reports/:id/run and shows results in a data table
//    - Results table with sortable column headers
//    - Row count badge
//    - 'Save Report' saves the config, 'Export CSV' triggers download
//
// 3. Schedule Dialog:
//    - Opens when clicking 'Schedule' on a saved report
//    - Frequency selector (daily/weekly/monthly)
//    - Email input
//    - Calls POST /api/report-schedules on save

Expected result: The wizard shows column checkboxes populated from schema introspection. Adding filters shows the correct operators. Preview renders a data table with real query results.

Complete code

server/routes/dataSources.js
const express = require('express');
const { db } = require('../db');
const { dataSources } = require('../schema');
const { eq, and, sql } = require('drizzle-orm');
const { withDbRetry } = require('../lib/retryDb');

const router = express.Router();

const TYPE_MAP = {
  integer: 'number', bigint: 'number', numeric: 'number', real: 'number',
  text: 'string', 'character varying': 'string', varchar: 'string', char: 'string',
  'timestamp with time zone': 'date', 'timestamp without time zone': 'date', date: 'date',
  boolean: 'boolean', jsonb: 'json', json: 'json', 'ARRAY': 'array',
};

router.get('/api/data-sources', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const rows = await db.select().from(dataSources)
    .where(eq(dataSources.userId, req.user.id))
    .orderBy(dataSources.name);
  return res.json({ sources: rows });
});

router.post('/api/data-sources', express.json(), async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const { name, type, connectionConfig } = req.body;
  if (!name || !type) return res.status(400).json({ error: 'name and type are required' });
  // Require a table name and reject anything outside alphanumerics + underscore
  if (type === 'postgres' && !(connectionConfig?.table && /^[a-zA-Z0-9_]+$/.test(connectionConfig.table))) {
    return res.status(400).json({ error: 'Invalid table name — alphanumeric and underscores only' });
  }
  const row = await withDbRetry(() =>
    db.insert(dataSources).values({
      userId: req.user.id, name, type,
      connectionConfig: connectionConfig || null,
    }).returning()
  );
  return res.status(201).json(row[0]);
});

router.get('/api/data-sources/:id/schema', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const [source] = await db.select().from(dataSources)
    .where(and(eq(dataSources.id, parseInt(req.params.id)), eq(dataSources.userId, req.user.id)))
    .limit(1);
  if (!source) return res.status(404).json({ error: 'Data source not found' });
  const tableName = source.connectionConfig?.table;
  if (!tableName || !/^[a-zA-Z0-9_]+$/.test(tableName)) {
    return res.status(400).json({ error: 'Invalid table name in data source config' });
  }
  const result = await db.execute(sql`
    SELECT column_name, data_type, is_nullable FROM information_schema.columns
    WHERE table_schema = 'public' AND table_name = ${tableName} ORDER BY ordinal_position
  `);
  const columns = result.rows.map(col => ({
    field: col.column_name,
    type: TYPE_MAP[col.data_type] || col.data_type,
    nullable: col.is_nullable === 'YES',
    supportedAggregates: TYPE_MAP[col.data_type] === 'number'
      ? ['none', 'count', 'sum', 'avg', 'min', 'max']
      : ['none', 'count'],
  }));
  return res.json({ sourceId: source.id, tableName, columns });
});

module.exports = router;

Customization ideas

Report sharing via public link

Add a share_token column to reports. A GET /public/reports/:token/run endpoint executes the report without auth, returning JSON data — so stakeholders can embed report data in external dashboards without a Replit account.
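
A minimal sketch of those two routes, assuming a share_token column on reports and the shared executeReport helper from step 4 (names illustrative, added to server/routes/reports.js):

const crypto = require('crypto');

// Owner generates a share token for an existing report
router.post('/api/reports/:id/share', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const token = crypto.randomBytes(24).toString('hex');
  await db.update(reports).set({ shareToken: token })
    .where(and(eq(reports.id, parseInt(req.params.id)), eq(reports.userId, req.user.id)));
  return res.json({ url: `/public/reports/${token}/run` });
});

// Anyone with the token can run the report, no login required
router.get('/public/reports/:token/run', async (req, res) => {
  const [report] = await db.select().from(reports)
    .where(eq(reports.shareToken, req.params.token)).limit(1);
  if (!report) return res.status(404).json({ error: 'Report not found' });
  const rows = await executeReport(report.id);
  return res.json({ rows });
});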

Chart visualization layer

After the results table, add a 'Create Chart' button that opens a chart configurator. Users pick a chart type (bar, line, pie) and map report columns to axes. Render using recharts in the preview panel.

CSV data source support

Allow users to upload a CSV file as a data source. Parse the CSV with csv-parse, store rows in a dynamic_data table with dataset_id, and treat the column headers as the schema for the report builder.

Report versioning

Add a report_versions table that stores snapshots of the report config on each save. Users can view version history and restore a previous config if they accidentally break a useful report.

Common pitfalls

Pitfall: String-interpolating user-selected field names directly into SQL

How to avoid: Always sanitize field names with a regex that only allows alphanumeric characters and underscores: field.replace(/[^a-zA-Z0-9_]/g, ''). Never use raw user input as table or column names without this validation.

Pitfall: Running reports without a LIMIT clause

How to avoid: Always add LIMIT 1000 (or configurable max) to the generated query for preview. For exports, stream rows in batches of 500 rather than loading the full result set at once.

Pitfall: Using the same data source for the reporting tool and the application it reports on

How to avoid: Reporting queries use SELECT only, so they shouldn't lock rows. But limit max concurrent report queries to 3 using a simple in-memory counter middleware to prevent overloading the connection pool.
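
One way that counter middleware could look (a per-process sketch; the count resets on restart and is not shared across Autoscale instances):

// Reject report runs beyond a small concurrency cap
let activeReports = 0;
const MAX_CONCURRENT_REPORTS = 3;

function reportConcurrencyLimit(req, res, next) {
  if (activeReports >= MAX_CONCURRENT_REPORTS) {
    return res.status(429).json({ error: 'Too many reports running, try again shortly' });
  }
  activeReports++;
  let released = false;
  const release = () => { if (!released) { released = true; activeReports--; } };
  res.on('finish', release);
  res.on('close', release); // also covers aborted requests
  next();
}

// Usage: router.post('/api/reports/:id/run', reportConcurrencyLimit, express.json(), handler)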

Pitfall: Forgetting to add SENDGRID_API_KEY to Deployment Secrets

How to avoid: Add SENDGRID_API_KEY to both Workspace Secrets (lock icon) and Deployment Secrets (Publish pane → Secrets). Redeploy after adding.

Best practices

  • Validate all user-provided field names and table names against a strict alphanumeric regex before using them in any SQL context — never trust user input for identifiers.
  • Always add LIMIT to generated queries. Let users configure a higher limit (up to 10,000) only for exports, and stream those in chunks.
  • Use PostgreSQL information_schema for column discovery — it's always accurate and eliminates maintenance of a separate schema definition file.
  • Store report configs as JSONB rather than separate relational columns — report definitions change shape often, and JSONB handles this without schema migrations.
  • Deploy on Autoscale for the main report API. Create a separate Scheduled Deployment for the email delivery job — run every hour to check for due schedules.
  • Add SENDGRID_API_KEY to Replit Secrets via the lock icon in the sidebar, and remember to add it again in Deployment Secrets in the Publish pane — they are completely separate.
  • Use Drizzle Studio (Database tab in Replit sidebar) to inspect report_schedules and verify next_run_at is being calculated correctly after each delivery.

AI prompts to try

Copy these prompts to build this project faster.

ChatGPT Prompt

I'm building a custom report builder with Express and PostgreSQL. I have a reports table where each report has columns as JSONB (array of {field, label, aggregate}), filters as JSONB (array of {field, operator, value}), group_by as text[] array, and sort_by as JSONB ({field, direction}). Help me write a JavaScript function buildReportQuery(report, tableName) that translates this config into a safe parameterized SQL string and params array using Drizzle ORM's sql template tag. Field names must be sanitized to alphanumeric+underscore only before being used in the query.

Build Prompt

Add a report templates library to the reporting tool. Create a report_templates table (id serial, name text, description text, category text, data_source_type text, config jsonb — contains the full columns/filters/groupBy/sortBy structure, is_public boolean default true). Pre-populate with 5 templates: 'Monthly Revenue Summary', 'Top Customers by Order Count', 'Weekly Signups by Source', 'Product Sales by Category', 'Refund Rate by Month'. Build a GET /api/templates route that returns public templates, a POST /api/templates/:id/fork route that creates a copy of the template as a user's saved report (they can then customize it), and a React Templates page showing template cards with name, description, and a 'Use Template' button.

Frequently asked questions

Can I report on tables from a different database, not Replit's built-in PostgreSQL?

The current build uses Replit's built-in PostgreSQL and the information_schema for introspection. To connect to an external database, store the external connection string in Replit Secrets and create a second Drizzle client in server/externalDb.js. Note that external databases require whitelisting 0.0.0.0/0 since Replit has dynamic outbound IPs.

Is the report execution safe from SQL injection?

Yes, if you follow the pattern in step 3: sanitize all field and table names with a regex (alphanumeric + underscore only), bind every filter value as a query parameter through Drizzle's sql template tag, and never string-interpolate user input directly into SQL strings. The OPERATORS whitelist ensures only valid comparison operators reach the query.

What Replit plan do I need?

Free tier is sufficient. The main app runs on Autoscale, and the Scheduled Deployment for email delivery also runs on Free tier. SendGrid has a free tier of 100 emails/day — enough for most scheduled report needs.

Can non-technical users add data to the reporting tool without SQL?

That's the whole point — the three-step wizard lets users pick columns, add filter conditions, and choose aggregates through dropdowns and checkboxes. They never write SQL. The introspection API shows only the columns that actually exist in the table.

How do I handle reports that join multiple tables?

The current build supports single-table reports. For joins, extend the report schema with a joins JSONB field (array of {table, join_type, on_field}). The query builder would need to generate JOIN clauses for each entry. This is the main complexity increase for a v2 build.
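
As a sketch of that v2 change, JOIN generation could reuse the same identifier sanitization as step 3 (the config shape and the assumption that joins key against the joined table's id column are illustrative):

const JOIN_TYPES = { inner: 'INNER JOIN', left: 'LEFT JOIN' };

// joins: [{ table: 'customers', join_type: 'left', on_field: 'customer_id' }]
const joinClauses = (report.joins || []).map(j => {
  const table = j.table.replace(/[^a-zA-Z0-9_]/g, '');      // identifiers can't be parameterized
  const onField = j.on_field.replace(/[^a-zA-Z0-9_]/g, '');
  const type = JOIN_TYPES[j.join_type] || 'INNER JOIN';     // whitelist join types
  return `${type} ${table} ON ${tableName}.${onField} = ${table}.id`;
});
// Feed each clause through sql.raw, like the other identifier fragments in step 3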

Should I deploy on Autoscale or Reserved VM?

Autoscale works for the main report API. Reports are run on-demand and users tolerate a short cold-start wait. The scheduled report delivery job runs as a separate Scheduled Deployment, completely independent of the main app's deployment target.

Can RapidDev help build a custom reporting tool for my business?

Yes. RapidDev builds custom reporting systems with multi-table joins, role-based access to specific data sources, white-labeled report exports, and integration with your existing BI stack. 600+ apps built, free consultation available.

What happens if a report returns too many rows and crashes the server?

The LIMIT 1000 in the query execution prevents loading too many rows for preview. For CSV exports, stream rows in batches of 500 using a cursor-based approach rather than loading the full result set. Add a 30-second query timeout on the database connection to prevent runaway queries.
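
A batched loop is the simplest way to stream without a cursor library. This sketch assumes a hypothetical runReportQuery helper that accepts limit/offset and a toCsvLine formatter:

res.setHeader('Content-Type', 'text/csv');
const BATCH = 500;
for (let offset = 0; ; offset += BATCH) {
  const batch = await runReportQuery(report, { limit: BATCH, offset }); // hypothetical helper
  for (const row of batch) res.write(toCsvLine(row));                   // flush each chunk
  if (batch.length < BATCH) break; // a short batch means we've reached the end
}
res.end();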
