Track LLM call reliability in n8n by building a monitoring workflow that logs every API call's success, failure, latency, and token usage to a database. Use the Error Trigger node to capture failures, a Code node to compute metrics, and a scheduled aggregation workflow to generate daily success rate reports and alerts when reliability drops below thresholds.
Why You Need to Monitor LLM Call Success Rates
LLM APIs are inherently unreliable — they experience rate limits, timeouts, server errors, and content filtering blocks. Without monitoring, you have no visibility into how often your n8n workflows succeed or fail, how much latency users experience, or how much you are spending on API calls. This tutorial shows how to build a complete monitoring system inside n8n itself, using error workflows, logging nodes, and a daily aggregation workflow that alerts you when success rates drop below acceptable thresholds.
Prerequisites
- A running n8n instance (self-hosted or cloud) on version 1.30 or later
- A PostgreSQL or MySQL database accessible from n8n
- At least one workflow that calls an LLM API (OpenAI, Claude, Mistral, etc.)
- Basic understanding of SQL INSERT and SELECT queries
- An email or Slack credential for alert notifications
Step-by-step guide
Create the logging database table
Create a PostgreSQL table to store LLM call logs. Each row represents one API call with its outcome, latency, token usage, and error details. Use the Postgres node in a setup workflow or run the SQL directly in your database client. The table schema captures everything needed for success rate calculations, cost tracking, and failure analysis.
```sql
-- PostgreSQL: Create the LLM call log table
CREATE TABLE IF NOT EXISTS llm_call_logs (
  id SERIAL PRIMARY KEY,
  workflow_id VARCHAR(255) NOT NULL,
  workflow_name VARCHAR(255),
  node_name VARCHAR(255),
  provider VARCHAR(50) NOT NULL,   -- openai, anthropic, mistral, cohere
  model VARCHAR(100),
  status VARCHAR(20) NOT NULL,     -- success, error, timeout, rate_limited
  error_message TEXT,
  error_code VARCHAR(10),
  latency_ms INTEGER,
  prompt_tokens INTEGER,
  completion_tokens INTEGER,
  total_tokens INTEGER,
  session_id VARCHAR(255),
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_llm_logs_created ON llm_call_logs(created_at);
CREATE INDEX idx_llm_logs_status ON llm_call_logs(status);
CREATE INDEX idx_llm_logs_provider ON llm_call_logs(provider);
```

Expected result: The llm_call_logs table exists in your PostgreSQL database with indexes for efficient querying.
Add success logging after each LLM node
After each LLM API call in your workflow (OpenAI node, HTTP Request to Claude, etc.), add a Code node followed by a Postgres node to log the successful call. The Code node extracts the relevant metrics (latency, tokens, model) from the API response, and the Postgres node inserts them into the log table. Connect this logging branch to the success output of your LLM node so it only fires on successful calls.
```javascript
// Code node (JavaScript)
// Extract metrics from a successful LLM call
// Place after your OpenAI/Claude/Mistral node

const item = $input.first();
const startTime = $('Webhook').first().json._startTime || Date.now();
const latencyMs = Date.now() - startTime;

// Extract token usage (format varies by provider)
const usage = item.json.usage || item.json.message?.usage || {};

return [{
  json: {
    workflow_id: $workflow.id,
    workflow_name: $workflow.name,
    node_name: $node.name,
    provider: 'openai', // Change per provider
    model: item.json.model || 'unknown',
    status: 'success',
    error_message: null,
    error_code: null,
    latency_ms: latencyMs,
    prompt_tokens: usage.prompt_tokens || usage.input_tokens || 0,
    completion_tokens: usage.completion_tokens || usage.output_tokens || 0,
    total_tokens: (usage.prompt_tokens || usage.input_tokens || 0) + (usage.completion_tokens || usage.output_tokens || 0),
    session_id: $('Webhook').first().json.sessionId || null
  }
}];
```

Expected result: Every successful LLM call is logged to PostgreSQL with latency, token usage, and model information.
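The actual insert happens in the Postgres node that follows this Code node. As a minimal sketch, assuming you use the node's Execute Query operation with query parameters (the node's Insert operation, which maps incoming JSON fields to columns, is an alternative), the query could look like this:

```sql
-- Postgres node (Execute Query): insert one log row per call.
-- $1..$13 assume the values are supplied via the node's query parameters,
-- in the same order as the fields returned by the Code node above.
INSERT INTO llm_call_logs (
  workflow_id, workflow_name, node_name, provider, model, status,
  error_message, error_code, latency_ms, prompt_tokens,
  completion_tokens, total_tokens, session_id
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13);
```

Whichever operation you use, enable 'Continue On Fail' on this node so a logging hiccup cannot block the main LLM branch (see the common mistakes below).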
Set up the Error Trigger workflow for failure logging
Create a separate Error Workflow that is triggered whenever any LLM workflow fails. Go to Workflow Settings in each LLM workflow and set the Error Workflow to this new workflow. The Error Trigger node receives the full error details including the failed node name, error message, and execution ID. Parse the error to categorize it (timeout, rate_limit, auth_error, server_error) and log it to the same database table.
```javascript
// Code node in Error Workflow (JavaScript)
// Categorize and log LLM failures

const errorData = $input.first().json;
const errorMessage = errorData.execution?.error?.message || 'Unknown error';
const nodeName = errorData.execution?.lastNodeExecuted || 'Unknown';

// Categorize the error
let status = 'error';
let errorCode = '';

if (errorMessage.includes('ETIMEDOUT') || errorMessage.includes('timeout')) {
  status = 'timeout';
  errorCode = '408';
} else if (errorMessage.includes('429') || errorMessage.includes('rate limit')) {
  status = 'rate_limited';
  errorCode = '429';
} else if (errorMessage.includes('401') || errorMessage.includes('Unauthorized')) {
  status = 'auth_error';
  errorCode = '401';
} else if (errorMessage.includes('500') || errorMessage.includes('Internal Server')) {
  status = 'server_error';
  errorCode = '500';
} else if (errorMessage.includes('529') || errorMessage.includes('overloaded')) {
  status = 'server_error';
  errorCode = '529';
}

return [{
  json: {
    workflow_id: errorData.workflow?.id || 'unknown',
    workflow_name: errorData.workflow?.name || 'unknown',
    node_name: nodeName,
    provider: nodeName.toLowerCase().includes('openai') ? 'openai'
      : nodeName.toLowerCase().includes('claude') ? 'anthropic'
      : nodeName.toLowerCase().includes('mistral') ? 'mistral'
      : 'unknown',
    model: 'unknown',
    status,
    error_message: errorMessage.substring(0, 500),
    error_code: errorCode,
    latency_ms: null,
    prompt_tokens: 0,
    completion_tokens: 0,
    total_tokens: 0,
    session_id: null
  }
}];
```

Expected result: Every LLM workflow failure is automatically categorized and logged to the same llm_call_logs table.
Build the daily aggregation and alerting workflow
Create a scheduled workflow that runs daily (using the Schedule Trigger node set to run at 9:00 AM). It queries the llm_call_logs table for the last 24 hours, computes success rates per provider and per workflow, and sends an alert via email or Slack if any success rate drops below 95%. This gives you a daily reliability report without needing external monitoring tools.
```sql
-- SQL query for the Postgres node
-- Daily success rate aggregation
SELECT
  provider,
  COUNT(*) as total_calls,
  COUNT(*) FILTER (WHERE status = 'success') as successful_calls,
  ROUND(
    COUNT(*) FILTER (WHERE status = 'success')::numeric / COUNT(*)::numeric * 100, 2
  ) as success_rate,
  COUNT(*) FILTER (WHERE status = 'timeout') as timeouts,
  COUNT(*) FILTER (WHERE status = 'rate_limited') as rate_limits,
  COUNT(*) FILTER (WHERE status = 'error') as errors,
  ROUND(AVG(latency_ms) FILTER (WHERE status = 'success'), 0) as avg_latency_ms,
  SUM(total_tokens) as total_tokens_used
FROM llm_call_logs
WHERE created_at >= NOW() - INTERVAL '24 hours'
GROUP BY provider
ORDER BY success_rate ASC;
```

Expected result: A daily report showing success rate, average latency, and token usage per LLM provider, with alerts for low reliability.
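If you also want the per-workflow breakdown mentioned above, a small variant of the same query (same table, grouped by workflow instead of provider) covers it. This sketch can run as a second Postgres node in the same scheduled workflow:

```sql
-- Variant: success rate per workflow over the last 24 hours
SELECT
  workflow_name,
  provider,
  COUNT(*) as total_calls,
  ROUND(
    COUNT(*) FILTER (WHERE status = 'success')::numeric / COUNT(*)::numeric * 100, 2
  ) as success_rate,
  ROUND(AVG(latency_ms) FILTER (WHERE status = 'success'), 0) as avg_latency_ms
FROM llm_call_logs
WHERE created_at >= NOW() - INTERVAL '24 hours'
GROUP BY workflow_name, provider
ORDER BY success_rate ASC;
```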
Add alerting logic for low success rates
After the aggregation query, add a Code node that checks if any provider's success rate is below your threshold (e.g., 95%). If so, format an alert message and route it to the Send Email or Slack node. Use the IF node to branch: if all rates are above threshold, skip the alert. This prevents alert fatigue while ensuring you are notified of real problems.
```javascript
// Code node (JavaScript)
// Check success rates and generate alert if needed

const items = $input.all();
const THRESHOLD = 95; // Alert if success rate drops below 95%

const alerts = [];
const summary = [];

for (const item of items) {
  const rate = parseFloat(item.json.success_rate);
  const provider = item.json.provider;

  summary.push(
    `${provider}: ${rate}% success (${item.json.total_calls} calls, ${item.json.avg_latency_ms}ms avg latency)`
  );

  if (rate < THRESHOLD) {
    alerts.push(
      `⚠ ${provider}: ${rate}% success rate (${item.json.errors} errors, ${item.json.timeouts} timeouts, ${item.json.rate_limits} rate limits)`
    );
  }
}

return [{
  json: {
    hasAlerts: alerts.length > 0,
    alertMessage: alerts.length > 0
      ? `LLM Success Rate Alert\n\n${alerts.join('\n')}\n\nFull Summary:\n${summary.join('\n')}`
      : null,
    summaryMessage: `Daily LLM Report\n\n${summary.join('\n')}`,
    alertCount: alerts.length
  }
}];
```

Expected result: An alert is generated only when a provider's success rate drops below 95%, with a full summary of all providers included.
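The branching itself needs no further code. As a sketch of the wiring, assuming the field names produced by the Code node above: give the IF node a boolean condition on the hasAlerts field, route the true branch to your Slack or Send Email node, and use the alertMessage and summaryMessage fields as the message bodies.

```javascript
// IF node condition (sketch): boolean check on the Code node output
//   {{ $json.hasAlerts }}        -> true branch sends the alert
// Message fields for Slack / Send Email:
//   {{ $json.alertMessage }}     -> alert text (true branch)
//   {{ $json.summaryMessage }}   -> optional daily summary (false branch)
```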
Complete working example
```javascript
// Complete Code node: Universal LLM call logger
// Place after any LLM node. Works with OpenAI, Claude, Mistral, Cohere.

const item = $input.first();
const response = item.json;

// Detect provider from response structure
let provider = 'unknown';
let model = 'unknown';
let promptTokens = 0;
let completionTokens = 0;

if (response.model?.includes('gpt') || response.model?.includes('o1')) {
  provider = 'openai';
  model = response.model;
  promptTokens = response.usage?.prompt_tokens || 0;
  completionTokens = response.usage?.completion_tokens || 0;
} else if (response.model?.includes('claude')) {
  provider = 'anthropic';
  model = response.model;
  promptTokens = response.usage?.input_tokens || 0;
  completionTokens = response.usage?.output_tokens || 0;
} else if (response.model?.includes('mistral')) {
  provider = 'mistral';
  model = response.model;
  promptTokens = response.usage?.prompt_tokens || 0;
  completionTokens = response.usage?.completion_tokens || 0;
} else if (response.generation_id) {
  provider = 'cohere';
  model = response.model || 'command';
  promptTokens = response.meta?.billed_units?.input_tokens || 0;
  completionTokens = response.meta?.billed_units?.output_tokens || 0;
}

// Calculate latency if start time was recorded
const startTime = item.json._startTime || $('Set Start Time')?.first()?.json?._startTime;
const latencyMs = startTime ? Date.now() - startTime : null;

return [{
  json: {
    workflow_id: $workflow.id,
    workflow_name: $workflow.name,
    node_name: $node.name,
    provider,
    model,
    status: 'success',
    error_message: null,
    error_code: null,
    latency_ms: latencyMs,
    prompt_tokens: promptTokens,
    completion_tokens: completionTokens,
    total_tokens: promptTokens + completionTokens,
    session_id: null,
    created_at: new Date().toISOString()
  }
}];
```
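The latency calculation in this example assumes some node recorded a start timestamp before the LLM call. A minimal sketch of that node, here a Code node named 'Set Start Time' (the name is only an illustration and must match whatever you reference), placed directly before the LLM node:

```javascript
// Code node 'Set Start Time': runs directly before the LLM node.
// Passes the incoming items through unchanged and stamps the current
// time so the logger can compute latency afterwards.
return $input.all().map(item => ({
  json: {
    ...item.json,
    _startTime: Date.now()
  }
}));
```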
Common mistakes when monitoring LLM call success rates in n8n
Mistake: Only logging failures without logging successes, which makes success rate calculation impossible.
How to avoid: Log every LLM call, successes and failures alike, so the denominators in your success rate percentages are accurate.
Mistake: Putting the logging Postgres node in the main execution path, where a database failure blocks the LLM response.
How to avoid: Use the 'Continue On Fail' setting on the logging node so database issues do not block the main workflow.
Mistake: Not setting the Error Workflow in the LLM workflow's settings, so failures that crash the workflow are never logged.
How to avoid: Go to each LLM workflow's Settings and set the Error Workflow to your error logging workflow.
Mistake: Alerting on every single failure instead of aggregating to success rates.
How to avoid: Use the daily aggregation workflow to check rates; individual failures are normal and expected.
Mistake: Not categorizing errors (timeout vs. rate limit vs. auth), which makes root cause analysis difficult.
How to avoid: Parse error messages in the Error Workflow Code node and assign status categories (timeout, rate_limited, auth_error).
Best practices
- Log both successes and failures to the same table for accurate success rate calculations
- Use the Error Trigger node for failure logging — it captures errors that crash the workflow
- Set a 95% success rate threshold for alerts — below this indicates a systemic issue
- Track latency separately from success/failure — high latency is a leading indicator of future failures
- Index the created_at and status columns for fast aggregation queries
- Run the aggregation workflow daily, not hourly — hourly alerts cause fatigue without actionable insight
- Include token usage in logs to track API costs alongside reliability
- Retain logs for at least 90 days to identify long-term trends, and prune older rows on a schedule (a cleanup sketch follows this list)
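A minimal cleanup sketch, assuming the 90-day window above, that can run from a Schedule Trigger plus Postgres node (or a plain cron job):

```sql
-- Scheduled cleanup: drop log rows older than the retention window
DELETE FROM llm_call_logs
WHERE created_at < NOW() - INTERVAL '90 days';
```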
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
I want to monitor the success rate of LLM API calls in my n8n workflows. How do I log every call to a database, set up error tracking, and build a daily success rate report with alerts?
How do I use n8n's Error Trigger node to log failed LLM calls to PostgreSQL and build a scheduled workflow that computes daily success rates per provider?
Frequently asked questions
Can I use n8n's built-in execution history instead of a custom logging table?
n8n's execution history shows workflow-level results but does not aggregate metrics like success rates, average latency, or token usage. Custom logging provides the structured data needed for monitoring and alerting.
How much storage does the logging table use?
Each log row is approximately 500 bytes. At 1,000 LLM calls per day, that is about 500KB/day or 15MB/month — negligible for any PostgreSQL instance.
Should I log LLM calls from test executions?
Add a flag to exclude test executions from production metrics. Check if the execution was triggered manually (test) or by a trigger (production) and filter accordingly in your aggregation queries.
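One way to do that, assuming your n8n version exposes the $execution.mode variable in Code nodes ('test' for manual runs, 'production' for trigger runs): add a small pass-through Code node before the Postgres insert that stamps each row, and give llm_call_logs a matching execution_mode column. Both the node and the column are illustrative additions, not part of the setup above.

```javascript
// Pass-through Code node: forward the log row and tag it with the execution mode.
// Assumes $execution.mode is available; the typeof guard covers older versions.
const executionMode = typeof $execution !== 'undefined' ? $execution.mode : 'unknown';

return $input.all().map(item => ({
  json: {
    ...item.json,
    execution_mode: executionMode
  }
}));
```

The aggregation queries can then add AND execution_mode = 'production' to the WHERE clause to keep test runs out of production metrics.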
What is a normal success rate for LLM API calls?
A well-configured workflow should achieve 97-99% success rates. Rates below 95% indicate a systemic issue (wrong model, insufficient retries, rate limit misconfiguration). Rates below 90% require immediate investigation.
Can I send monitoring data to external tools like Grafana or Datadog?
Yes, you can extend the logging workflow to send metrics to external monitoring tools via HTTP Request nodes. Send data to Grafana Cloud's push API, Datadog's metrics endpoint, or any other monitoring service.
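As a sketch of what that extension might look like: a Code node after the daily aggregation can reshape each provider row into a simple metrics payload, which an HTTP Request node then POSTs to your monitoring endpoint. The payload shape below is a generic placeholder, not any specific vendor's API; consult your tool's documentation for the exact format and authentication.

```javascript
// Code node: reshape aggregation rows into a generic metrics payload
// for an HTTP Request node. Field names here are illustrative only.
const now = Math.floor(Date.now() / 1000);

return $input.all().map(item => ({
  json: {
    metric: 'n8n.llm.success_rate',
    value: parseFloat(item.json.success_rate),
    timestamp: now,
    tags: {
      provider: item.json.provider
    }
  }
}));
```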
Can RapidDev help build a comprehensive monitoring system for n8n LLM workflows?
Yes, RapidDev builds enterprise-grade monitoring for n8n including LLM call tracking, cost analytics, latency dashboards, and alerting. Their team can integrate monitoring with Grafana, Datadog, or custom dashboards tailored to your workflow architecture.