To set concurrency limits in n8n, configure the N8N_CONCURRENCY_PRODUCTION_LIMIT environment variable to cap how many workflow executions run simultaneously. This prevents resource exhaustion on busy instances by queuing excess executions in FIFO order. Set the value in your .env file or docker-compose.yml and restart n8n to apply.
Controlling Parallel Workflow Executions with Concurrency Limits in n8n
When multiple webhook calls, cron triggers, or manual executions fire at the same time, n8n attempts to run all of them in parallel. On resource-constrained servers, this can spike memory usage, exhaust database connections, and cause workflows to fail. The N8N_CONCURRENCY_PRODUCTION_LIMIT environment variable tells n8n to queue excess executions and process them in first-in-first-out order, giving you predictable resource consumption and stable performance.
Prerequisites
- A self-hosted n8n instance (v1.22 or later)
- Access to the server's environment variables or docker-compose.yml
- Basic understanding of n8n workflow execution lifecycle
- Permission to restart the n8n process
Step-by-step guide
Determine your server's capacity
Before setting a concurrency limit, assess how many parallel executions your server can handle. Check your available RAM and CPU cores. A general rule of thumb is to allow 1-2 concurrent executions per CPU core for CPU-bound workflows, or 5-10 per core for I/O-bound workflows that mostly wait on HTTP requests. If your workflows call LLM APIs, they are I/O-bound and can tolerate higher concurrency. If they process large files or run heavy Code nodes, keep concurrency lower. Monitor your server's memory usage during peak load to find the sweet spot.
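As a rough illustration, the rule of thumb above can be turned into a quick sizing script. This is a sketch only: it assumes a Linux host, an I/O-bound factor of 2 executions per core, and roughly 150 MB of memory per execution — replace that figure with measurements from your own workflows.

```shell
# Rough sizing sketch: pick a starting concurrency limit from host resources.
# Assumptions: Linux host, 2 executions per core, ~150 MB per execution.
CORES=$(nproc)
AVAIL_MB=$(free -m | awk '/^Mem:/ {print $7}')   # the "available" column
PER_EXEC_MB=150                                  # assumption: measure your own workflows
CPU_LIMIT=$(( CORES * 2 ))
MEM_LIMIT=$(( AVAIL_MB / PER_EXEC_MB ))
# Take the smaller of the two as a conservative starting point
SUGGESTED=$(( CPU_LIMIT < MEM_LIMIT ? CPU_LIMIT : MEM_LIMIT ))
echo "Suggested N8N_CONCURRENCY_PRODUCTION_LIMIT: $SUGGESTED"
```

Treat the output as a starting point, not a final answer — only load testing against your real workflows confirms the right value.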
Expected result: You have a target concurrency number based on your server's resources and workflow characteristics.
Set N8N_CONCURRENCY_PRODUCTION_LIMIT in your environment
Add the N8N_CONCURRENCY_PRODUCTION_LIMIT variable to your environment configuration. For Docker deployments, add it to your docker-compose.yml under the environment section. For npm-based installations, add it to your .env file in the n8n data directory. The value is an integer representing the maximum number of production workflow executions that can run at the same time. Set it to -1 to disable the limit entirely (the default behavior). Any value of 1 or higher enables the queue.
```yaml
# docker-compose.yml
version: '3.8'
services:
  n8n:
    image: docker.n8n.io/n8nio/n8n
    ports:
      - '5678:5678'
    environment:
      - N8N_CONCURRENCY_PRODUCTION_LIMIT=5
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
    volumes:
      - n8n_data:/home/node/.n8n

# Or in .env file for npm installations:
# N8N_CONCURRENCY_PRODUCTION_LIMIT=5
```
Expected result: The environment variable is configured in your deployment file.
Restart n8n to apply the concurrency limit
The concurrency limit is read at startup and cannot be changed at runtime. Restart your n8n instance for the new setting to take effect. For Docker Compose deployments, run docker compose down followed by docker compose up -d. For npm installations, stop the n8n process and start it again. For systemd-managed installations, use systemctl restart n8n. After restarting, check the n8n logs to confirm the concurrency limit was applied. n8n logs a message at startup indicating the production concurrency limit.
```shell
# Docker Compose
docker compose down && docker compose up -d

# systemd-managed installations
systemctl restart n8n

# Check logs for confirmation
docker compose logs n8n | grep -i concurrency
```
Expected result: n8n restarts and logs confirm the concurrency limit is active. You see a log line referencing the production concurrency limit value.
Test the concurrency limit with parallel webhook triggers
Create a simple test workflow with a Webhook trigger node and a Wait node set to 10 seconds. Activate the workflow and send multiple simultaneous requests to the production webhook URL using curl or a tool like Apache Bench. Send more requests than your concurrency limit to verify that excess executions are queued rather than rejected. Open the n8n Executions panel to observe that only N executions run at a time while the rest wait in the queue.
```shell
# Send 10 simultaneous requests to test a limit of 5
for i in $(seq 1 10); do
  curl -s -X POST https://your-n8n.example.com/webhook/test-concurrency \
    -H 'Content-Type: application/json' \
    -d '{"request_number": '"$i"'}' &
done
wait
echo "All requests sent"
```
Expected result: The Executions panel shows 5 running and 5 waiting (queued). As running executions complete, queued ones start automatically.
Monitor queued executions and adjust the limit
After running under the concurrency limit for a few days, review execution patterns to fine-tune the value. Check the Executions panel for executions that spent excessive time in the queue, which indicates the limit is too low. Also check server resource usage to see if there is headroom to increase the limit. If webhook callers time out waiting for responses, you may need to increase the limit or switch the Webhook node's response mode to Immediately so callers get a fast acknowledgment while the workflow runs in the background.
Expected result: You have a stable concurrency limit that balances server resources with acceptable queue wait times.
Complete working example
```yaml
version: '3.8'

services:
  n8n:
    image: docker.n8n.io/n8nio/n8n
    restart: unless-stopped
    ports:
      - '5678:5678'
    environment:
      # Concurrency control
      - N8N_CONCURRENCY_PRODUCTION_LIMIT=5

      # Database (PostgreSQL recommended for production)
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}

      # Security
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_BASIC_AUTH_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_BASIC_AUTH_PASSWORD}

      # Webhook URL for production triggers
      - WEBHOOK_URL=https://n8n.example.com/

      # Execution settings
      - EXECUTIONS_DATA_PRUNE=true
      - EXECUTIONS_DATA_MAX_AGE=168
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      postgres:
        condition: service_healthy

  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_DB=n8n
      - POSTGRES_USER=n8n
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U n8n']
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  n8n_data:
  postgres_data:
```
Common mistakes when setting concurrency limits in n8n
Mistake: Setting the concurrency limit on n8n Cloud instead of self-hosted
How to avoid: N8N_CONCURRENCY_PRODUCTION_LIMIT is only available on self-hosted instances. n8n Cloud manages concurrency internally and does not expose this setting.
Mistake: Using the webhook test URL to verify concurrency limits
How to avoid: Concurrency limits only apply to production executions. Use the production webhook URL (/webhook/) when testing, not the test URL (/webhook-test/).
Mistake: Setting the limit too low and causing webhook timeouts
How to avoid: If callers time out waiting for a response, either increase the limit or switch the Webhook response mode to Immediately so the caller gets a 200 right away.
Mistake: Forgetting to restart n8n after changing the environment variable
How to avoid: The concurrency limit is read at startup. Always restart n8n after changing N8N_CONCURRENCY_PRODUCTION_LIMIT.
Best practices
- Always use PostgreSQL as the database backend when enabling concurrency limits — SQLite does not handle concurrent writes reliably
- Start with a concurrency limit of 5 and increase by 2-3 at a time while monitoring server resources
- Use the Immediately response mode on Webhook nodes for endpoints where callers cannot wait in a queue
- Set EXECUTIONS_DATA_PRUNE=true and EXECUTIONS_DATA_MAX_AGE to prevent the executions table from growing unbounded
- Monitor memory usage per execution to calculate the safe maximum for your server's RAM
- Separate CPU-heavy workflows from I/O-heavy ones across different n8n instances if you need different concurrency profiles
- Document your concurrency limit and the reasoning behind the chosen value for your operations team
- Test concurrency behavior with your actual production workflows, not just simple test workflows
Still stuck?
Copy one of these prompts to get a personalized, step-by-step explanation.
I run a self-hosted n8n instance on a 4-core server with 8GB RAM. My workflows are mostly webhook-triggered and call external APIs. Help me determine the right N8N_CONCURRENCY_PRODUCTION_LIMIT value and explain how the FIFO queue works.
Set up a docker-compose.yml for n8n with N8N_CONCURRENCY_PRODUCTION_LIMIT set to 5, PostgreSQL backend, basic auth enabled, and execution data pruning configured. Include health checks and restart policies.
Frequently asked questions
What is the default concurrency limit in n8n?
By default, N8N_CONCURRENCY_PRODUCTION_LIMIT is set to -1, which means there is no limit. All triggered executions run immediately in parallel without queuing.
Does the concurrency limit apply to manual executions?
No, the concurrency limit only applies to production executions triggered by webhooks, cron schedules, and other trigger nodes on active workflows. Manual test executions from the editor are not affected.
What happens to queued executions if n8n restarts?
Queued executions that have not started are lost on restart if using SQLite. With PostgreSQL, pending executions are recovered after restart because execution data is persisted in the database.
Can I set different concurrency limits for different workflows?
No, N8N_CONCURRENCY_PRODUCTION_LIMIT is a global setting that applies to all workflows equally. To achieve per-workflow limits, you would need to run separate n8n instances with different limits.
How does concurrency limiting differ from queue mode?
Concurrency limiting caps parallel executions on a single n8n instance. Queue mode (using Redis and worker processes) distributes executions across multiple n8n workers for horizontal scaling. You can use both together.
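For orientation, a minimal queue-mode setup looks roughly like the following. This is a sketch under assumptions: the Redis hostname and the worker concurrency value are placeholders for your deployment.

```shell
# Sketch: switching n8n to queue mode (requires a reachable Redis instance).
export EXECUTIONS_MODE=queue
export QUEUE_BULL_REDIS_HOST=redis      # assumption: Redis service named "redis"
export QUEUE_BULL_REDIS_PORT=6379

# Start the main instance (accepts triggers, enqueues jobs)
n8n start &

# Start a worker; --concurrency caps parallel jobs on this worker
n8n worker --concurrency=5 &
```

In queue mode, per-worker concurrency is governed by the worker's --concurrency flag, so capacity planning shifts from a single global limit to the number of workers times each worker's concurrency.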
Will webhook callers receive an error when executions are queued?
No, the HTTP connection stays open until the execution completes (when using When Last Node Finishes response mode). The caller waits longer but does not get an error. Set the response mode to Immediately if callers cannot wait.
Can I monitor how many executions are currently queued?
Check the Executions panel in the n8n UI to see running and waiting executions. You can also query the n8n API endpoint GET /api/v1/executions to list active executions programmatically.
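To script that check, something like the following should work. It is a sketch: the base URL and API key are placeholders, and you would generate the key in the n8n UI under Settings > n8n API.

```shell
# Sketch: list recent executions through the n8n public REST API.
N8N_URL="https://your-n8n.example.com"    # placeholder: your instance URL
N8N_API_KEY="replace-with-your-api-key"   # placeholder: key created in the n8n UI
curl -s "$N8N_URL/api/v1/executions?limit=20" \
  -H "X-N8N-API-KEY: $N8N_API_KEY"
```

Polling this endpoint from a monitoring script lets you alert when the backlog of waiting executions grows beyond a threshold.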
Can RapidDev help configure concurrency and scaling for my n8n instance?
Yes, RapidDev's engineering team can assess your workflow load, configure optimal concurrency limits, and set up queue mode with Redis workers if you need horizontal scaling for high-throughput automation.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation