Production n8n: Queue Workers, Metrics & Monitoring
At Woltex, n8n is used to automate business operations support. But running n8n in production isn't just spinning up a container. You need queue workers for reliability, metrics for visibility, and monitoring to catch issues before they become problems.
Here's our complete production stack, battle-tested and ready to deploy.
Quick Start
Get your production-ready n8n stack running in under 5 minutes:
Create project directory
mkdir n8n-production && cd n8n-production
Set up directory structure
Create the necessary directories for Grafana provisioning:
mkdir -p grafana/provisioning/dashboards
mkdir -p grafana/provisioning/datasources
mkdir -p grafana/dashboards
Generate secure environment variables
Create a .env file with secure credentials:
# Generate all secrets at once
echo "N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)" >> .env
echo "JWT_SECRET=$(openssl rand -hex 32)" >> .env
echo "POSTGRES_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "GRAFANA_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "N8N_HOST=n8n.yourdomain.com" >> .env
Keep these secure!
These secrets are critical for your n8n security. Never commit the .env file to git.
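Before launching, it can help to confirm that the .env file actually contains every variable the compose file expects. The following is a minimal sketch rather than part of the original setup: the helper name `missing_env_keys` is hypothetical, and the key list simply mirrors the five variables generated above.

```shell
#!/bin/sh
# Hypothetical helper: print each required key that is absent or empty in a .env file.
# The key list mirrors the variables generated above; adjust it if yours differ.
REQUIRED_KEYS="N8N_ENCRYPTION_KEY JWT_SECRET POSTGRES_PASSWORD GRAFANA_PASSWORD N8N_HOST"

missing_env_keys() {
  env_file="$1"
  for key in $REQUIRED_KEYS; do
    # A key counts as present only if some line reads KEY=<at least one character>.
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      echo "$key"
    fi
  done
}

# Usage (from the project directory):
#   missing_env_keys .env    # prints nothing when every key is set
```

An empty result means all five variables are defined; any printed name needs to be added before the stack is started.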
Add configuration files
Create the configuration files described in these sections (scroll down to copy each one):
- Infrastructure Setup (docker-compose.yml)
- Metrics Configuration (prometheus.yml)
- Dashboard Setup (Grafana configs)
Launch the stack
docker-compose up -d
Wait ~30 seconds for all services to initialize, then check the logs:
docker-compose logs -f
Access your services
- n8n UI: http://localhost:5678
- Grafana Dashboard: http://localhost:3000 (user admin, password from .env)
- Prometheus: http://localhost:9090
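Instead of waiting a fixed 30 seconds, a small polling loop can block until a service answers. This is a minimal sketch: `wait_for` is a hypothetical helper name, and the attempt count and delay are arbitrary choices. The `/healthz` path in the usage comment is the one the compose healthcheck in this article already relies on.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command...>
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the command quietly; stop as soon as it exits with status 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the n8n health endpoint for up to about 60 seconds.
#   wait_for 30 2 curl -fsS http://localhost:5678/healthz && echo "n8n is up"
```

The same helper works for any command that signals readiness through its exit status, so it can be reused for the other services.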
Why Production n8n Matters
n8n is powerful for workflow automation, but out-of-the-box it's missing production essentials. When a workflow fails at 3 AM, you need to know. When queue depth hits 500, you need visibility. When your automation infrastructure is business-critical, you need monitoring.
| Feature | What You Get |
|---|---|
| Reliable Execution | Queue workers handle workloads without blocking the main instance |
| Full Observability | Prometheus metrics expose every aspect of your n8n instance |
| Ready Dashboard | Pre-built Grafana dashboard with alerts and visualization |
| Horizontal Scaling | Add workers on-demand to handle increased load |
Architecture Overview
Here's how all the pieces fit together:
┌───────────────────────────────────────────────────┐
│                 Production Stack                  │
├───────────────────────────────────────────────────┤
│                                                   │
│  ┌──────────┐     ┌────────────────────────────┐  │
│  │   n8n    │────>│    PostgreSQL Database     │  │
│  │   Main   │     │  (workflows + executions)  │  │
│  └──────────┘     └────────────────────────────┘  │
│       │                                           │
│       v                                           │
│  ┌─────────────────────────────────────────────┐  │
│  │                 Redis Queue                 │  │
│  │      (job distribution & coordination)      │  │
│  └─────────────────────────────────────────────┘  │
│            │                        │             │
│            v                        v             │
│       ┌──────────┐             ┌──────────┐       │
│       │  Worker  │             │  Worker  │       │
│       │    #1    │             │    #2    │       │
│       └──────────┘             └──────────┘       │
│            │                        │             │
│            └────────────┬───────────┘             │
│                         v                         │
│                 ┌───────────────┐                 │
│                 │  Prometheus   │                 │
│                 │   (metrics)   │                 │
│                 └───────────────┘                 │
│                         │                         │
│                         v                         │
│                 ┌───────────────┐                 │
│                 │    Grafana    │                 │
│                 │ (dashboards)  │                 │
│                 └───────────────┘                 │
│                                                   │
└───────────────────────────────────────────────────┘
Component roles
- n8n main instance — handles UI, webhooks, and schedules workflows
- Queue workers — execute workflows independently, scale horizontally
- Redis — job queue and coordination between main + workers
- PostgreSQL — stores workflows, credentials, and execution history
- Prometheus — scrapes metrics from n8n, stores time-series data
- Grafana — visualizes everything with pre-built dashboards
Infrastructure Setup
The foundation is a Docker Compose setup that orchestrates n8n with queue workers, Redis, PostgreSQL, and the monitoring stack.
Docker Compose configuration
Save this as docker-compose.yml in your project root:
version: "3.8"
services:
  prometheus:
    image: prom/prometheus:latest
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "127.0.0.1:9090:9090" # Only localhost
    networks:
      - n8n-network

  grafana:
    image: grafana/grafana:latest
    restart: unless-stopped
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_PATHS_PROVISIONING: /etc/grafana/provisioning
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning/datasources:/etc/grafana/provisioning/datasources:ro
      - ./grafana/provisioning/dashboards:/etc/grafana/provisioning/dashboards:ro
      - ./grafana/dashboards:/var/lib/grafana/dashboards:ro
    ports:
      - "127.0.0.1:3000:3000"
    depends_on:
      - prometheus
    networks:
      - n8n-network

  postgres:
    image: postgres:16
    restart: unless-stopped
    environment:
      POSTGRES_DB: n8n
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U n8n"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: "1"
          memory: 1G
    networks:
      - n8n-network

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    # noeviction: Redis holds the job queue, so keys must never be evicted under memory pressure
    command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy noeviction
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - n8n-network

  n8n:
    image: n8nio/n8n:latest
    restart: unless-stopped
    ports:
      - "127.0.0.1:5678:5678" # Only localhost; publish through a reverse proxy (see Securing access)
    environment:
      # Database Configuration
      DB_TYPE: postgresdb
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
      # n8n Host Configuration
      N8N_HOST: ${N8N_HOST}
      N8N_PROTOCOL: https
      N8N_PORT: 5678
      WEBHOOK_URL: https://${N8N_HOST}/
      # Security
      N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
      N8N_USER_MANAGEMENT_JWT_SECRET: ${JWT_SECRET}
      # Queue Mode Configuration (CRITICAL)
      EXECUTIONS_MODE: queue
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
      QUEUE_BULL_REDIS_DB: 0
      # Worker Health Check
      QUEUE_HEALTH_CHECK_ACTIVE: "true"
      # Execution Data Management
      EXECUTIONS_DATA_SAVE_ON_ERROR: all
      EXECUTIONS_DATA_SAVE_ON_SUCCESS: all
      EXECUTIONS_DATA_SAVE_ON_PROGRESS: "true"
      EXECUTIONS_DATA_SAVE_MANUAL_EXECUTIONS: "true"
      EXECUTIONS_DATA_PRUNE: "true"
      EXECUTIONS_DATA_MAX_AGE: 336 # 14 days in hours
      # Binary Data Storage (IMPORTANT for queue mode)
      N8N_DEFAULT_BINARY_DATA_MODE: filesystem
      # Concurrency for Main Process
      N8N_CONCURRENCY_PRODUCTION_LIMIT: 3
      # Metrics & Monitoring
      N8N_METRICS: "true"
      N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL: "true"
      N8N_METRICS_INCLUDE_NODE_TYPE_LABEL: "true"
      N8N_METRICS_INCLUDE_CREDENTIAL_TYPE_LABEL: "true"
      # Logging
      N8N_LOG_LEVEL: info
      N8N_LOG_OUTPUT: console,file
      # Timezone
      GENERIC_TIMEZONE: Europe/London
      TZ: Europe/London
    volumes:
      - n8n_data:/home/node/.n8n
      - n8n_files:/files
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "wget --no-verbose --tries=1 --spider http://localhost:5678/healthz || exit 1",
        ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    networks:
      - n8n-network

  # Worker - Handles workflow executions from queue
  n8n-worker:
    image: n8nio/n8n:latest
    restart: unless-stopped
    command: worker
    environment:
      # Database Configuration
      DB_TYPE: postgresdb
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
      # Queue Mode Configuration
      EXECUTIONS_MODE: queue
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
      QUEUE_BULL_REDIS_DB: 0
      # Worker Health Check
      QUEUE_HEALTH_CHECK_ACTIVE: "true"
      # Security
      N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
      # Worker Concurrency
      N8N_CONCURRENCY_PRODUCTION_LIMIT: 10
      # Binary Data Storage
      N8N_DEFAULT_BINARY_DATA_MODE: filesystem
      # Logging
      N8N_LOG_LEVEL: info
      N8N_LOG_OUTPUT: console
      # Timezone
      GENERIC_TIMEZONE: Europe/London
      TZ: Europe/London
    volumes:
      - n8n_data:/home/node/.n8n
      - n8n_files:/files
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      n8n:
        condition: service_healthy
    deploy:
      replicas: 2 # Start with 2 workers
    networks:
      - n8n-network

volumes:
  postgres_data:
  redis_data:
  n8n_data:
  n8n_files:
  prometheus_data:
  grafana_data:

networks:
  n8n-network:
    driver: bridge
Metrics Configuration
Prometheus needs to know where to scrape metrics. n8n exposes metrics at /metrics when N8N_METRICS=true is configured.
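To confirm the endpoint is live before pointing Prometheus at it, the text output can be inspected directly. A minimal sketch follows; `metric_names` is a hypothetical helper, and it assumes the endpoint returns the standard Prometheus text exposition format, in which lines starting with `#` are HELP/TYPE comments.

```shell
#!/bin/sh
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print the unique metric names, skipping "# HELP" / "# TYPE" comment lines.
metric_names() {
  awk '
    /^#/ || NF == 0 { next }          # skip comments and blank lines
    {
      name = $1
      sub(/\{.*/, "", name)           # drop the {label="..."} part, if any
      if (!(name in seen)) { seen[name] = 1; print name }
    }
  '
}

# Usage:
#   curl -fsS http://localhost:5678/metrics | metric_names
```

An empty list usually means metrics are not enabled on the instance being queried.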
Prometheus setup
Save this as prometheus.yml in your project root:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "n8n"
    static_configs:
      - targets: ["n8n:5678"]
    metrics_path: "/metrics"
Dashboard Setup
Grafana provisioning means dashboards and data sources are automatically loaded when Grafana starts. No manual clicking through the UI.
Save as grafana/provisioning/datasources/prometheus.yml:
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false
This automatically connects Grafana to your Prometheus instance.
Save as grafana/provisioning/dashboards/dashboards.yml:
apiVersion: 1
providers:
  - name: "default"
    orgId: 1
    folder: ""
    type: file
    disableDeletion: false
    editable: true
    options:
      path: /var/lib/grafana/dashboards
This tells Grafana where to find dashboard JSON files.
Production-ready dashboard
Save your dashboard as grafana/dashboards/n8n.json. This gives you a complete production-ready dashboard with real-time monitoring.
Dashboard Best Practice
Export dashboards from Grafana as JSON, commit them to git, and they'll load automatically on every new deployment. Version control for your monitoring setup!
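Because provisioning depends on every file sitting at the path the compose file mounts, a quick pre-flight check can catch a misplaced file before Grafana starts with an empty dashboard list. This sketch uses a hypothetical `missing_files` helper; the paths are the ones named in this article.

```shell
#!/bin/sh
# Hypothetical helper: print every expected config file that is missing
# under the given project directory. Paths are the ones used in this article.
missing_files() {
  project_dir="$1"
  for path in \
    docker-compose.yml \
    prometheus.yml \
    grafana/provisioning/datasources/prometheus.yml \
    grafana/provisioning/dashboards/dashboards.yml \
    grafana/dashboards/n8n.json
  do
    # Report the relative path only when no regular file exists there.
    [ -f "$project_dir/$path" ] || echo "$path"
  done
}

# Usage (from the project directory):
#   missing_files .    # prints nothing when the layout is complete
```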
Dashboard Insights
Once your Grafana dashboard is running, you'll have comprehensive visibility into your n8n operations:
Performance Overview
- Total executions per hour/day/week
- Success vs failure rate (%)
- Average execution time by workflow
- Slowest workflows (P95, P99 latency)
Workflow Analysis
- Most frequently executed workflows
- Workflows with the highest failure rate
- Execution trends over time
Job Processing
- Active jobs currently in process
- Waiting jobs (queue backlog)
- Processing rate (jobs/minute)
- Average job wait time
Worker Performance
- Worker utilization (%)
- Jobs per worker
- Worker availability status
If queue depth consistently exceeds 100, consider scaling your workers horizontally.
Memory & CPU
- Heap memory usage (current/max)
- External memory allocation
- CPU utilization per container
- Event loop lag (detect blocking operations)
Dependencies
- PostgreSQL connection pool status
- Redis memory usage and connections
- Database query performance
Node.js Internals
- Garbage collection frequency and duration
- V8 heap statistics
- Process uptime and restarts
Production Operations
Environment variables best practices
Security First
Never hardcode credentials in docker-compose.yml. Always use environment
variables from a .env file, and add .env to your .gitignore.
Your .env file should contain all sensitive configuration:
# Security keys (generate with: openssl rand -hex 32)
N8N_ENCRYPTION_KEY=your-64-char-hex-key-here
JWT_SECRET=your-64-char-hex-key-here
# Database password (generate with: openssl rand -base64 24)
POSTGRES_PASSWORD=your-secure-db-password
# Domain configuration
N8N_HOST=n8n.yourdomain.com
# Grafana admin password (generate with: openssl rand -base64 24)
GRAFANA_PASSWORD=your-grafana-password
Quick secret generation:
echo "N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)" >> .env
echo "JWT_SECRET=$(openssl rand -hex 32)" >> .env
echo "POSTGRES_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "GRAFANA_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "N8N_HOST=n8n.yourdomain.com" >> .env
Scaling workers
Start with 2 workers and monitor queue depth in Grafana. When jobs start backing up, scale horizontally:
docker-compose up -d --scale n8n-worker=4
Worker Scaling Strategy
- Queue depth < 10: Current workers are handling load well
- Queue depth 10-50: Monitor closely, consider scaling soon
- Queue depth > 50: Scale workers immediately
- Queue depth > 100: Critical - scale urgently and investigate bottlenecks
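The thresholds above can be sketched as a small decision function. This is only an illustration: `scale_advice` and its output labels are hypothetical names, and how the current queue depth is obtained (for example, read off the Grafana panel) is left open.

```shell
#!/bin/sh
# Hypothetical helper: map a queue depth to the action from the list above.
scale_advice() {
  depth="$1"
  if [ "$depth" -gt 100 ]; then
    echo "critical"      # scale urgently and investigate bottlenecks
  elif [ "$depth" -gt 50 ]; then
    echo "scale-now"     # scale workers immediately
  elif [ "$depth" -ge 10 ]; then
    echo "monitor"       # monitor closely, consider scaling soon
  else
    echo "ok"            # current workers are handling the load
  fi
}

# Usage:
#   scale_advice 72
```

The most severe threshold is checked first, so a depth above 100 is reported as critical rather than falling into the "above 50" branch.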
Alerting configuration
Configure Grafana alerts for critical conditions, such as a sustained queue backlog or a rising failure rate, to catch issues before they impact users.
Backup strategy
Your workflows and execution history live in PostgreSQL. Implement automated backups:
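# Add to crontab: crontab -e
# The compose file sets no container_name, so run pg_dump through docker-compose
# from the project directory (adjust the path); -T disables the TTY, which cron lacks
0 2 * * * cd /path/to/n8n-production && docker-compose exec -T postgres pg_dump -U n8n n8n > /backups/n8n-$(date +\%Y\%m\%d).sql
Securing access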
Never Expose Directly
Don't expose ports directly to the internet. Always use a reverse proxy with TLS termination.
Recommended Setup Options:
Cloudflare Tunnel gives zero-config secure access without opening ports:
# Install cloudflared
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
chmod +x /usr/local/bin/cloudflared
# Authenticate with your Cloudflare account (required before creating a tunnel)
cloudflared tunnel login
# Create tunnel
cloudflared tunnel create n8n-tunnel
# Configure tunnel
cloudflared tunnel route dns n8n-tunnel n8n.yourdomain.com
# Run tunnel
cloudflared tunnel --url http://localhost:5678 run n8n-tunnel
Use Nginx with Let's Encrypt for TLS:
server {
    listen 443 ssl http2;
    server_name n8n.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/n8n.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/n8n.yourdomain.com/privkey.pem;

    location / {
        proxy_pass http://localhost:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
Add Traefik service to your docker-compose.yml:
  traefik:
    image: traefik:v2.10
    command:
      - "--providers.docker=true"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.myresolver.acme.tlschallenge=true"
      - "--certificatesresolvers.myresolver.acme.email=your@email.com"
      # Store certificates in the mounted volume so they survive restarts
      - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik_certs:/letsencrypt # also declare traefik_certs under the top-level volumes key
    networks:
      - n8n-network
Then add labels to your n8n service:
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.n8n.rule=Host(`n8n.yourdomain.com`)"
      - "traefik.http.routers.n8n.entrypoints=websecure"
      - "traefik.http.routers.n8n.tls.certresolver=myresolver"
Additional Resources
Want to dive deeper into specific topics? Check out these official documentation resources:
n8n Queue Mode
Deep dive into queue mode architecture, scaling strategies, and configuration options
Prometheus Configuration
Learn about advanced Prometheus setup, recording rules, and alerting configurations
Grafana Provisioning
Master Grafana's provisioning system for automated dashboard and datasource management
