Production n8n: Queue Workers, Metrics & Monitoring
Running n8n in production isn't just spinning up a container. You need queue workers for reliability, metrics for visibility, and monitoring to catch issues before they become problems.
At Woltex, we run n8n to automate our business operations, so the stack behind it has to be dependable. Here's the complete production setup we use: queue workers, a Prometheus metrics pipeline, and Grafana dashboards with alerting.
Why We Built This
n8n is powerful for workflow automation, but out of the box it's missing production essentials. When a workflow fails at 3 AM, you need to know. When queue depth hits 500, you need visibility. When your automation infrastructure is business-critical, you need monitoring.
| Feature | Description |
|---|---|
| Reliable Execution | Queue workers handle workloads without blocking the main instance |
| Full Observability | Prometheus metrics expose every aspect of your n8n instance |
| Ready Dashboard | Pre-built Grafana dashboard with alerts and visualization |
The Stack
┌───────────────────────────────────────────────────┐
│ Production Stack │
├───────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌────────────────────────────┐ │
│ │ n8n │────>│ PostgreSQL Database │ │
│ │ Main │ │ (workflows + executions) │ │
│ └──────────┘ └────────────────────────────┘ │
│ │ │
│ v │
│ ┌──────────────────────────────────────────┐ │
│ │ Redis Queue │ │
│ │ (job distribution & coordination) │ │
│ └──────────────────────────────────────────┘ │
│ │ │ │
│ v v │
│ ┌──────────┐ ┌──────────┐ │
│ │ Worker │ │ Worker │ │
│ │ #1 │ │ #2 │ │
│ └──────────┘ └──────────┘ │
│ │ │ │
│ └────────┬───────────┘ │
│ v │
│ ┌───────────────┐ │
│ │ Prometheus │ │
│ │ (metrics) │ │
│ └───────────────┘ │
│ │ │
│ v │
│ ┌───────────────┐ │
│ │ Grafana │ │
│ │ (dashboards) │ │
│ └───────────────┘ │
│ │
└───────────────────────────────────────────────────┘
- n8n main instance: handles UI, webhooks, and schedules workflows
- Queue workers: execute workflows independently, scale horizontally
- Redis: job queue and coordination between main + workers
- Prometheus: scrapes metrics from n8n, stores time-series data
- Grafana: visualizes everything with pre-built dashboards
Step 1: Core Infrastructure
The foundation is a Docker Compose setup that orchestrates n8n with queue workers, Redis, PostgreSQL, and the monitoring stack.
version: "3.8"
services:
prometheus:
image: prom/prometheus:latest
restart: unless-stopped
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus_data:/prometheus
ports:
- "127.0.0.1:9090:9090" # Only localhost
networks:
- n8n-network
grafana:
image: grafana/grafana:latest
restart: unless-stopped
environment:
GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
GF_PATHS_PROVISIONING: /etc/grafana/provisioning
volumes:
- grafana_data:/var/lib/grafana
- ./grafana/provisioning/datasources:/etc/grafana/provisioning/datasources:ro
- ./grafana/provisioning/dashboards:/etc/grafana/provisioning/dashboards:ro
- ./grafana/dashboards:/var/lib/grafana/dashboards:ro
ports:
- "127.0.0.1:3000:3000"
depends_on:
- prometheus
networks:
- n8n-network
postgres:
image: postgres:16
restart: unless-stopped
environment:
POSTGRES_DB: n8n
POSTGRES_USER: n8n
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U n8n"]
interval: 10s
timeout: 5s
retries: 5
deploy:
resources:
limits:
cpus: "1"
memory: 1G
networks:
- n8n-network
redis:
image: redis:7-alpine
restart: unless-stopped
command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru
volumes:
- redis_data:/data
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
networks:
- n8n-network
n8n:
image: n8nio/n8n:latest
restart: unless-stopped
ports:
- "5678:5678"
environment:
# Database Configuration
DB_TYPE: postgresdb
DB_POSTGRESDB_HOST: postgres
DB_POSTGRESDB_PORT: 5432
DB_POSTGRESDB_DATABASE: n8n
DB_POSTGRESDB_USER: n8n
DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
# n8n Host Configuration
N8N_HOST: ${N8N_HOST}
N8N_PROTOCOL: https
N8N_PORT: 5678
WEBHOOK_URL: https://${N8N_HOST}/
# Security
N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
N8N_USER_MANAGEMENT_JWT_SECRET: ${JWT_SECRET}
# Queue Mode Configuration (CRITICAL)
EXECUTIONS_MODE: queue
QUEUE_BULL_REDIS_HOST: redis
QUEUE_BULL_REDIS_PORT: 6379
QUEUE_BULL_REDIS_DB: 0
# Worker Health Check
QUEUE_HEALTH_CHECK_ACTIVE: "true"
# Execution Data Management
EXECUTIONS_DATA_SAVE_ON_ERROR: all
EXECUTIONS_DATA_SAVE_ON_SUCCESS: all
EXECUTIONS_DATA_SAVE_ON_PROGRESS: "true"
EXECUTIONS_DATA_SAVE_MANUAL_EXECUTIONS: "true"
EXECUTIONS_DATA_PRUNE: "true"
EXECUTIONS_DATA_MAX_AGE: 336 # 14 days in hours
# Binary Data Storage (IMPORTANT for queue mode)
N8N_DEFAULT_BINARY_DATA_MODE: filesystem
# Concurrency for Main Process
N8N_CONCURRENCY_PRODUCTION_LIMIT: 3
# Metrics & Monitoring
N8N_METRICS: "true"
N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL: "true"
N8N_METRICS_INCLUDE_NODE_TYPE_LABEL: "true"
N8N_METRICS_INCLUDE_CREDENTIAL_TYPE_LABEL: "true"
# Logging
N8N_LOG_LEVEL: info
N8N_LOG_OUTPUT: console,file
# Timezone
GENERIC_TIMEZONE: Europe/London
TZ: Europe/London
volumes:
- n8n_data:/home/node/.n8n
- n8n_files:/files
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
healthcheck:
test:
[
"CMD-SHELL",
"wget --no-verbose --tries=1 --spider http://localhost:5678/healthz || exit 1",
]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
networks:
- n8n-network
# Worker - Handles workflow executions from queue
n8n-worker:
image: n8nio/n8n:latest
restart: unless-stopped
command: worker
environment:
# Database Configuration
DB_TYPE: postgresdb
DB_POSTGRESDB_HOST: postgres
DB_POSTGRESDB_PORT: 5432
DB_POSTGRESDB_DATABASE: n8n
DB_POSTGRESDB_USER: n8n
DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
# Queue Mode Configuration
EXECUTIONS_MODE: queue
QUEUE_BULL_REDIS_HOST: redis
QUEUE_BULL_REDIS_PORT: 6379
QUEUE_BULL_REDIS_DB: 0
# Worker Health Check
QUEUE_HEALTH_CHECK_ACTIVE: "true"
# Security
N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
# Worker Concurrency
N8N_CONCURRENCY_PRODUCTION_LIMIT: 10
# Binary Data Storage
N8N_DEFAULT_BINARY_DATA_MODE: filesystem
# Logging
N8N_LOG_LEVEL: info
N8N_LOG_OUTPUT: console
# Timezone
GENERIC_TIMEZONE: Europe/London
TZ: Europe/London
volumes:
- n8n_data:/home/node/.n8n
- n8n_files:/files
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
n8n:
condition: service_healthy
deploy:
replicas: 2 # Start with 2 workers
networks:
- n8n-network
volumes:
postgres_data:
redis_data:
n8n_data:
n8n_files:
prometheus_data:
grafana_data:
networks:
n8n-network:
    driver: bridge
Key points
- Workers run the same n8n image, started with the worker command and EXECUTIONS_MODE=queue
- Redis handles job distribution via QUEUE_BULL_REDIS_HOST (see the quick check below)
- Prometheus and Grafana are pre-configured, with volumes for persistence
- Environment variables configure queue mode and the metrics endpoint
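A quick way to confirm jobs are really flowing through Redis is to look for Bull queue keys after triggering a workflow. A small sketch, assuming the service names from the compose file above and Bull's default bull: key prefix (exact key names vary by n8n version):
# List Bull queue keys created by n8n's job queue
docker-compose exec redis redis-cli --scan --pattern 'bull:*' | head
# Workers log the executions they pick up
docker-compose logs --tail=20 n8n-worker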
Step 2: Prometheus Scraping Configuration
Prometheus needs to know where to scrape metrics. With N8N_METRICS=true set (as in the compose file above), n8n exposes them at /metrics on its web port.
Save as prometheus.yml:
global:
scrape_interval: 15s
scrape_configs:
- job_name: "n8n"
static_configs:
- targets: ["n8n:5678"]
    metrics_path: "/metrics"
What gets collected:
- Workflow execution counts and duration
- Queue depth and processing rate
- Node.js memory and CPU metrics
- HTTP request latency and throughput
- Database connection pool stats
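Before building dashboards on top of this, it's worth confirming the endpoint responds. A quick check from the host, assuming the port mappings from the compose file above (metric names are typically prefixed with n8n_, but the exact set depends on your n8n version and the label flags you enabled):
# Should return Prometheus-format metrics
curl -s http://localhost:5678/metrics | head -n 20
# Prometheus should report the n8n target as healthy
curl -s http://127.0.0.1:9090/api/v1/targets | grep -o '"health":"[a-z]*"'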
Step 3: Grafana Auto-Provisioning
Grafana provisioning means dashboards and data sources are automatically loaded when Grafana starts. No manual clicking through the UI.
Dashboard Configuration
Save as grafana/provisioning/dashboards/dashboards.yml:
apiVersion: 1
providers:
- name: "default"
orgId: 1
folder: ""
type: file
disableDeletion: false
editable: true
options:
      path: /var/lib/grafana/dashboards
Data Source Configuration
Save as grafana/provisioning/datasources/prometheus.yml:
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
url: http://prometheus:9090
isDefault: true
    editable: false
Pre-Built Dashboard
Save as grafana/dashboards/n8n.json - this gives you a complete production-ready dashboard with real-time monitoring of n8n performance, memory usage, event loop lag, garbage collection, and more.
Dashboard includes: Active workflows, leader status, version info, memory/CPU usage, event loop lag (P90/P99), garbage collection metrics, heap space usage, and process uptime.
Pro tip: Export dashboards from Grafana as JSON, commit them to git, and they'll load automatically on every new deployment.
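Putting Steps 1-3 together, the project ends up laid out like this on the host (paths match the volume mounts in docker-compose.yml):
n8n-production/
├── docker-compose.yml
├── prometheus.yml
├── .env
└── grafana/
    ├── provisioning/
    │   ├── dashboards/dashboards.yml
    │   └── datasources/prometheus.yml
    └── dashboards/
        └── n8n.json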
What You'll See in the Dashboard
Execution Metrics
- Total executions per hour/day/week
- Success vs failure rate
- Average execution time by workflow
- Slowest workflows (P95, P99 latency)
Queue Health
- Active jobs in queue
- Waiting jobs (backlog)
- Processing rate (jobs/minute)
- Worker utilization
System Resources
- Memory usage (heap + external)
- CPU utilization per container
- Database connection pool
- Redis memory and connections
Running It in Production
1. Environment Variables
Don't hardcode credentials in docker-compose.yml. Create a .env file:
# Generate secure keys with: openssl rand -hex 32
N8N_ENCRYPTION_KEY=your-64-char-hex-key-here
JWT_SECRET=your-64-char-hex-key-here
# Database password (generate with: openssl rand -base64 24)
POSTGRES_PASSWORD=your-secure-db-password
# Domain configuration
N8N_HOST=n8n.yourdomain.com
# Grafana admin password (generate with: openssl rand -base64 24)
GRAFANA_PASSWORD=your-grafana-password
Quick setup - generate all secrets at once:
echo "N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)" >> .env
echo "JWT_SECRET=$(openssl rand -hex 32)" >> .env
echo "POSTGRES_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "GRAFANA_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "N8N_HOST=n8n.yourdomain.com" >> .env2. Scaling Workers
2. Scaling Workers
Start with 2 workers. Monitor queue depth in Grafana. If jobs back up, scale horizontally:
docker-compose up -d --scale n8n-worker=4
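After scaling, confirm the new workers are up and picking up jobs:
docker-compose ps n8n-worker
docker-compose logs --tail=50 -f n8n-worker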
3. Alerting
Configure Grafana alerts for critical conditions (a version-controlled alternative follows the list):
- Queue depth > 100 for 5 minutes
- Workflow failure rate > 10%
- Memory usage > 90%
- Worker down (no metrics scraped)
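If you prefer alert definitions in version control rather than the Grafana UI, Prometheus alerting rules are an alternative. A sketch only: it assumes you add a rule_files entry to prometheus.yml and run an Alertmanager (neither is part of the stack above), and the queue metric name is a placeholder, so check your instance's /metrics output for the real one:
# alert-rules.yml (sketch; reference it from prometheus.yml via rule_files)
groups:
  - name: n8n
    rules:
      - alert: N8nQueueBacklog
        # Placeholder metric name: substitute the waiting-jobs gauge your /metrics endpoint exposes
        expr: n8n_queue_jobs_waiting > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "n8n queue depth above 100 for 5 minutes"
      - alert: N8nScrapeDown
        # up{} is generated by Prometheus itself, so no n8n-specific metric is needed here
        expr: up{job="n8n"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "n8n metrics endpoint is not being scraped"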
4. Backups
Your workflows and execution history live in PostgreSQL. Automate backups:
# Daily backup cron
0 2 * * * docker exec n8n-postgres pg_dump -U n8n n8n > /backups/n8n-$(date +%Y%m%d).sql
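Restoring is the same pipeline in reverse. A sketch, assuming a plain-SQL dump like the one above and an empty target database; the container name depends on your Compose project, so check docker ps first:
# Restore a dump into the Postgres container (adjust container name and filename)
docker exec -i n8n-postgres psql -U n8n -d n8n < /backups/n8n-20250101.sql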
5. Securing Access
Don't expose ports directly to the internet. Put everything behind a reverse proxy (nginx, Traefik) with TLS, or use a tunnel solution like Cloudflare Tunnel.
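As one example, here is a minimal nginx server block. A sketch only: it assumes certificates are already provisioned and that you change the n8n port mapping to 127.0.0.1:5678:5678 so only the proxy is reachable from outside. The Upgrade headers matter because the n8n editor uses websockets:
server {
    listen 443 ssl;
    server_name n8n.yourdomain.com;
    # ssl_certificate / ssl_certificate_key go here

    location / {
        proxy_pass http://127.0.0.1:5678;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 300s;
    }
}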
Get Started
- Create a directory: mkdir n8n-production && cd n8n-production
- Create docker-compose.yml and paste the stack definition (from Step 1 above)
- Create prometheus.yml and paste the scrape config (from Step 2)
- Set up the Grafana provisioning structure and paste the config files (from Step 3): mkdir -p grafana/provisioning/dashboards grafana/provisioning/datasources grafana/dashboards
- Generate your .env file with secrets
- Start the stack: docker-compose up -d
- Access Grafana at http://localhost:3000
- Access n8n at http://localhost:5678
First time? Give it 30 seconds to fully start up, and check the logs with docker-compose logs -f if something doesn't load.
Learn More
💡 This is how we run n8n at Woltex. Full visibility, reliable execution, zero surprises. Copy the configs, adjust for your needs, and ship it.
