Monitoring & Logging Setup
Overview
This document describes how to monitor and debug the n8n AI Support Automation system. It covers key metrics, troubleshooting procedures, and log inspection commands.
Key Metrics
1. Mail Processing Rate (Workflow A)
Description: Track the number of conversations processed through the system.
N8N Logs:
docker-compose logs -f n8n | grep "processed"
PostgreSQL Query:
SELECT COUNT(*) as total_executions,
COUNT(CASE WHEN status = 'success' THEN 1 END) as successful_executions,
ROUND(100.0 * COUNT(CASE WHEN status = 'success' THEN 1 END) / NULLIF(COUNT(*), 0), 2) as success_rate
FROM workflow_executions
WHERE workflow_name = 'workflow-a';
Expected Behavior:
- Consistent processing rate (depends on Freescout mail polling interval)
- Success rate > 95%
- Monitor for sudden drops in processing rate
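As a minimal sketch, the two counts returned by the query can be checked against the 95% target in a few lines of Python. The function names here are hypothetical and only illustrate the arithmetic; they are not part of the deployed workflows.

```python
from typing import Optional

# Target from "Expected Behavior": success rate must be above 95%.
SUCCESS_RATE_TARGET = 95.0


def success_rate(total: int, successful: int) -> Optional[float]:
    """Return the success rate in percent, or None when nothing has run yet."""
    if total == 0:
        return None  # avoid dividing by zero on an empty result set
    return round(100.0 * successful / total, 2)


def meets_target(total: int, successful: int) -> bool:
    """True only when a rate exists and it is strictly above the target."""
    rate = success_rate(total, successful)
    return rate is not None and rate > SUCCESS_RATE_TARGET
```

Returning `None` for an empty table keeps "no executions yet" distinct from "0% success", which matters when the check runs right after a fresh deployment.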
2. Approval Rate (Workflow B)
Description: Monitor the ratio of approved vs rejected KB updates from the AI suggestions.
PostgreSQL Query:
SELECT status, COUNT(*) as count,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) as percentage
FROM knowledge_base_updates
GROUP BY status
ORDER BY count DESC;
Alternative Query for detailed breakdown:
SELECT
status,
COUNT(*) as count,
AVG(EXTRACT(EPOCH FROM (updated_at - created_at))) as avg_approval_time_seconds
FROM knowledge_base_updates
GROUP BY status;
Expected Behavior:
- Majority of updates should be APPROVED (typically 70-90%)
- REJECTED rate should be < 15%
- PENDING updates should be resolved within 24 hours
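The expected ranges above can be turned into a simple check over the per-status counts. This is a sketch under the assumption that status values are stored as the uppercase strings APPROVED, REJECTED, and PENDING; the function name is hypothetical.

```python
from typing import Dict, List


def approval_warnings(counts: Dict[str, int]) -> List[str]:
    """Compare status shares with the expected ranges and list any deviations."""
    total = sum(counts.values())
    if total == 0:
        return ["no knowledge_base_updates rows yet"]

    approved = 100.0 * counts.get("APPROVED", 0) / total
    rejected = 100.0 * counts.get("REJECTED", 0) / total

    warnings = []
    if approved < 70.0:  # lower end of the typical 70-90% range
        warnings.append(f"approved share {approved:.1f}% is below 70%")
    if rejected >= 15.0:  # rejected rate should stay under 15%
        warnings.append(f"rejected share {rejected:.1f}% is not below 15%")
    return warnings
```

The input dictionary maps directly onto the `status, count` rows that the first query returns.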
3. KB Growth (Workflow C)
Description: Track the growth of the knowledge base as new information is added.
Milvus Query:
# First, connect to Milvus
docker-compose exec milvus python3 -c "
from pymilvus import connections, Collection
connections.connect('default', host='localhost', port=19530)
collection = Collection('knowledge_base')
print(f'Total vectors: {collection.num_entities}')
"
PostgreSQL Query for tracking:
SELECT COUNT(*) as total_entries,
COUNT(DISTINCT source) as unique_sources,
MAX(created_at) as latest_entry
FROM knowledge_base
WHERE status = 'approved';
Daily Growth Query:
SELECT DATE(created_at) as date, COUNT(*) as entries_added
FROM knowledge_base
WHERE status = 'approved'
GROUP BY DATE(created_at)
ORDER BY date DESC
LIMIT 30;
Expected Behavior:
- +1 vector per approved ticket (approximately)
- Steady growth correlates with approved KB updates
- Monitor for stalled growth (may indicate Milvus issues)
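Since roughly one vector is expected per approved entry, comparing the PostgreSQL count with the Milvus entity count is a quick way to spot stalled writes. The sketch below assumes both numbers have already been fetched with the queries above; the function name and the default tolerance are hypothetical, and the tolerance exists because the Milvus count can lag behind recent inserts.

```python
def kb_drift(approved_rows: int, milvus_entities: int, tolerance: int = 5) -> str:
    """Describe the gap between approved PostgreSQL rows and Milvus entities."""
    gap = approved_rows - milvus_entities
    if gap > tolerance:
        return f"Milvus is behind PostgreSQL by {gap} entries"
    if gap < -tolerance:
        return f"Milvus holds {-gap} more entities than approved rows"
    return "in sync"
```

A persistent "behind" result points toward the Workflow C troubleshooting steps later in this document.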
4. Error Rate
Description: Monitor workflow execution errors across all workflows.
PostgreSQL Query - Overall Error Rate:
SELECT
COUNT(*) as total_executions,
COUNT(CASE WHEN status = 'ERROR' THEN 1 END) as error_count,
ROUND(100.0 * COUNT(CASE WHEN status = 'ERROR' THEN 1 END) / NULLIF(COUNT(*), 0), 2) as error_percentage
FROM workflow_executions;
Detailed Error Analysis:
SELECT
workflow_name,
status,
COUNT(*) as count,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY workflow_name), 2) as percentage
FROM workflow_executions
GROUP BY workflow_name, status
ORDER BY workflow_name, count DESC;
Error Details for Investigation:
SELECT
workflow_name,
status,
error_message,
COUNT(*) as occurrences,
MAX(executed_at) as latest_error
FROM workflow_executions
WHERE status = 'ERROR'
GROUP BY workflow_name, status, error_message
ORDER BY occurrences DESC;
Expected Behavior:
- Error rate < 5%
- No recurring errors (a repeated error message indicates a systemic issue)
- Quick recovery from transient errors
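To separate recurring errors from transient ones, error rows can be grouped by workflow and message. The following is an illustrative sketch: the function name and the default of three occurrences are assumptions, and the input is expected to be `(workflow_name, error_message)` pairs exported from `workflow_executions`.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def recurring_errors(
    rows: Iterable[Tuple[str, str]], min_occurrences: int = 3
) -> List[Tuple[str, str, int]]:
    """Return (workflow, message, count) for errors seen at least min_occurrences times."""
    counts = Counter(rows)
    # most_common() yields the most frequent pairs first
    return [
        (workflow, message, n)
        for (workflow, message), n in counts.most_common()
        if n >= min_occurrences
    ]
```

Anything this returns is a candidate for a systemic issue; errors that fall below the cutoff are more likely transient.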
Troubleshooting Guide
Workflow A (Mail Processing) - Not Running
Symptoms:
- No new conversations being processed
- N8N logs show no activity
- PostgreSQL query returns unchanged row count
Troubleshooting Steps:
- Check if workflow trigger is active:
docker-compose logs -f n8n | grep "workflow-a"
- Verify Cron trigger configuration:
  - Log into n8n UI at https://<SUBDOMAIN>.<DOMAIN>
  - Navigate to workflow-a
  - Check cron expression (typically 0 */5 * * * * for every 5 minutes)
  - Verify "Active" toggle is ON
- Test Freescout API credentials:
docker-compose exec n8n curl -X GET \
-H "Authorization: Bearer ${FREESCOUT_API_TOKEN}" \
https://<freescout-instance>/api/v1/conversations
- Check Freescout API reachability:
docker-compose exec n8n ping <freescout-instance>
docker-compose exec n8n curl -I https://<freescout-instance>/api/v1/health
- Review n8n logs for errors:
docker-compose logs n8n | grep -i "error\|exception" | tail -20
- Verify PostgreSQL connection:
docker-compose logs n8n | grep -i "database\|postgres"
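The typical cron expression mentioned in the steps above has six fields rather than the five of classic cron. Assuming the extra leading field is seconds, a small helper can label each field so a misconfigured schedule is easier to spot. The function and field names are hypothetical and not taken from n8n.

```python
from typing import Dict

FIELD_NAMES = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> Dict[str, str]:
    """Label the fields of a 5- or 6-field cron expression."""
    parts = expression.split()
    if len(parts) == 5:
        # Assumption: a 5-field expression fires at second 0
        parts = ["0"] + parts
    if len(parts) != 6:
        raise ValueError(f"expected 5 or 6 fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))
```

For `0 */5 * * * *` this labels `*/5` as the minute field, which matches the "every 5 minutes" reading.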
Workflow B (AI Suggestions) - Not Triggering
Symptoms:
- No new AI suggestions in Freescout
- workflow_executions table shows no recent B entries
- knowledge_base_updates status stuck in PENDING
Troubleshooting Steps:
- Check if Freescout custom field is being updated:
SELECT * FROM freescout_conversation_custom_fields
WHERE field_name = 'AI_SUGGESTION_STATUS'
ORDER BY updated_at DESC
LIMIT 10;
- Verify polling interval:
  - Check n8n workflow B settings
  - Polling trigger should be running (typically every 1 minute)
  - Confirm:
docker-compose logs n8n | grep -i "polling\|workflow-b"
- Check webhook configuration:
# If using webhook instead of polling
docker-compose logs -f n8n | grep -i "webhook"
- Review Freescout API response:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT * FROM api_logs WHERE endpoint LIKE '%conversation%' ORDER BY timestamp DESC LIMIT 5;"
- Verify OpenAI/AI provider connectivity:
docker-compose logs n8n | grep -i "openai\|api\|llm" | tail -20
- Check if there are unprocessed conversations:
SELECT COUNT(*) as pending_conversations
FROM workflow_executions
WHERE workflow_name = 'workflow-a'
AND status = 'success'
AND ai_suggestion_generated = false
AND created_at > NOW() - INTERVAL '1 hour';
Workflow C (KB Storage) - Not Saving to Milvus
Symptoms:
- knowledge_base table updates but Milvus count doesn't increase
- KB search returns no results
- Milvus health check failures
Troubleshooting Steps:
- Check Milvus health status:
docker-compose exec milvus curl -s http://localhost:9091/healthz
- Verify Milvus is running:
docker-compose ps milvus
docker-compose logs milvus | tail -30
- Check if embeddings are being generated:
SELECT COUNT(*) as embeddings_generated
FROM knowledge_base
WHERE embedding IS NOT NULL;
- Verify Milvus connection in n8n logs:
docker-compose logs n8n | grep -i "milvus\|embedding" | tail -20
- Test Milvus directly:
# -T disables TTY allocation so the heredoc reaches python3 on stdin
docker-compose exec -T milvus python3 << 'EOF'
from pymilvus import connections, Collection
connections.connect('default', host='localhost', port=19530)
try:
    collection = Collection('knowledge_base')
    print(f'✓ Milvus connected, collection entities: {collection.num_entities}')
except Exception as e:
    print(f'✗ Milvus error: {e}')
EOF
- Check for rate limiting or connection timeouts:
docker-compose logs n8n | grep -i "timeout\|connection\|refused" | tail -20
- Verify vector dimension matches:
  - Check embedding model (should match Milvus collection definition)
  - Default: 1536 dimensions (OpenAI embeddings)
SELECT vector_dimension FROM milvus_schema WHERE collection_name = 'knowledge_base';
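A dimension mismatch is easier to diagnose if it is caught before the insert reaches Milvus. The sketch below assumes the default of 1536 dimensions mentioned above; the function name is hypothetical and the check would sit wherever the embedding is produced.

```python
from typing import Sequence

# Default from this guide: OpenAI embeddings with 1536 dimensions.
EXPECTED_DIM = 1536


def check_embedding(vector: Sequence[float], expected_dim: int = EXPECTED_DIM) -> None:
    """Raise ValueError when the embedding length differs from the collection dimension."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, collection expects {expected_dim}"
        )
```

Failing early with both numbers in the message avoids having to dig the mismatch out of the Milvus or n8n logs.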
Logs & Debugging Commands
View Real-time Logs
N8N Logs:
# All n8n logs
docker-compose logs -f n8n
# Follow specific keywords
docker-compose logs -f n8n | grep -i "error\|workflow\|processed"
# Last 100 lines
docker-compose logs --tail 100 n8n
PostgreSQL Logs:
# View recent PostgreSQL operations
docker-compose logs -f postgres
# Check database activity
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT now(), datname, usename, state FROM pg_stat_activity;"
Milvus Logs:
# View Milvus startup and operation logs
docker-compose logs -f milvus
# Check Milvus status
docker-compose exec milvus curl -s http://localhost:9091/healthz
Database Inspection
Recent Workflow Executions:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT workflow_name, status, executed_at, error_message FROM workflow_executions ORDER BY executed_at DESC LIMIT 10;"
KB Updates Status:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT status, COUNT(*) FROM knowledge_base_updates GROUP BY status;"
Last 24h Activity:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT DATE(executed_at) as date, workflow_name, status, COUNT(*) as count
FROM workflow_executions
WHERE executed_at > NOW() - INTERVAL '24 hours'
GROUP BY DATE(executed_at), workflow_name, status
ORDER BY date DESC, workflow_name;"
Performance Monitoring
PostgreSQL Connection Count:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT count(*) as connections FROM pg_stat_activity;"
PostgreSQL Cache Hit Ratio:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;"
Disk Usage:
docker-compose exec postgres psql -U kb_user -d n8n_kb -c \
"SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;"
Debugging Network Issues
Test connectivity between services:
# From n8n to PostgreSQL
docker-compose exec n8n ping postgres
# From n8n to Milvus
docker-compose exec n8n curl -v http://milvus:9091/healthz
# From n8n to Freescout
docker-compose exec n8n ping <freescout-host>
Alert Thresholds
Configure monitoring/alerting for these conditions:
| Metric | Threshold | Action |
|---|---|---|
| Error Rate | > 5% | Page on-call, review workflow logs |
| KB Growth Stalled | 0 entries in 4 hours | Check Milvus health and embeddings |
| Approval Rate | < 50% | Review AI suggestion quality |
| Processing Rate | Drop > 50% | Check Freescout connection |
| Milvus Health | Not healthy | Restart Milvus, check etcd/minio |
| PostgreSQL Connections | > 80% of max | Investigate connection leaks |
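The table can be read as a set of independent checks over one snapshot of the system. The sketch below encodes each row as a condition. The `Snapshot` fields are hypothetical names for values gathered with the queries in this document, and how they are collected is left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    error_rate_pct: float       # overall error percentage
    kb_entries_last_4h: int     # approved KB entries added in the last 4 hours
    approval_rate_pct: float    # share of APPROVED updates
    processing_drop_pct: float  # drop in processing rate versus baseline
    milvus_healthy: bool        # result of the Milvus health check
    pg_connections: int         # current PostgreSQL connections
    pg_max_connections: int     # configured max_connections


def triggered_alerts(s: Snapshot) -> List[str]:
    """Return the names of the metrics from the table whose threshold is crossed."""
    alerts = []
    if s.error_rate_pct > 5:
        alerts.append("Error Rate")
    if s.kb_entries_last_4h == 0:
        alerts.append("KB Growth Stalled")
    if s.approval_rate_pct < 50:
        alerts.append("Approval Rate")
    if s.processing_drop_pct > 50:
        alerts.append("Processing Rate")
    if not s.milvus_healthy:
        alerts.append("Milvus Health")
    if s.pg_connections > 0.8 * s.pg_max_connections:
        alerts.append("PostgreSQL Connections")
    return alerts
```

Keeping every threshold in one function makes it straightforward to compare against the table when a value changes.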
Regular Maintenance
Daily
- Check error rate < 5%
- Verify KB growth is progressing
- Review Freescout API response times
Weekly
- Analyze approval rate trends
- Check PostgreSQL disk usage
- Review n8n workflow performance
Monthly
- Full system health audit
- Database maintenance (VACUUM, ANALYZE)
- Log rotation verification
- Capacity planning review
Version Information
- n8n: Latest from docker.n8n.io/n8nio/n8n
- PostgreSQL: 15-alpine
- Milvus: v2.4.0
- Logging Driver: json-file with max 100MB per file, 10 files rotation
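The rotation settings put an upper bound on log disk usage per container, which is useful for capacity planning. As a rough sketch (the function name is hypothetical, and the container count depends on how many services the compose file defines):

```python
def max_log_mb(max_size_mb: int = 100, max_files: int = 10, containers: int = 1) -> int:
    """Worst-case log disk usage in MB for json-file rotation."""
    # Each container keeps at most max_files files of max_size_mb each
    return max_size_mb * max_files * containers
```

With the defaults this gives 1000 MB per container, so the total scales linearly with the number of services that log through this driver.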
Contact & Escalation
For issues not resolved by this guide:
- Collect logs:
docker-compose logs > system_logs.txt
- Export database state for analysis
- Contact DevOps team with reproducible steps