Skip to main content

Monitoring Guide

Version: v0.99.0-beta


Overview

Neural Commander provides built-in monitoring through:

  1. Health endpoints - HTTP health checks
  2. Daemon stats - Resource usage metrics
  3. Logs - Structured logging
  4. Admin console - Real-time TUI dashboard

Health Checks

Basic Health

curl http://localhost:7669/health
# Returns: {"status":"ok","version":"v0.99.0-beta"}

Detailed Health

curl http://localhost:7669/api/health
# Returns detailed component status

Health Check Script

#!/bin/bash
# nc-healthcheck.sh - Use with monitoring systems

RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:7669/health)

if [ "$RESPONSE" = "200" ]; then
echo "OK"
exit 0
else
echo "CRITICAL - NC API not responding (HTTP $RESPONSE)"
exit 2
fi

Metrics

Daemon Statistics

ncmd daemon stats

Output:

Neural Commander Daemon Statistics
──────────────────────────────────
Uptime: 4h 23m
Sessions Active: 3
Events Processed: 1,247
API Requests: 892
Cache Hit Rate: 78%
Memory Usage: 156 MB
CPU Usage: 2.3%

API Metrics Endpoint

curl http://localhost:7669/api/status

Returns:

{
"daemon": {
"uptime_seconds": 15780,
"sessions_active": 3,
"events_processed": 1247
},
"resources": {
"cpu_percent": 2.3,
"memory_mb": 156,
"goroutines": 42
},
"api": {
"requests_total": 892,
"cache_hit_rate": 0.78
}
}

Log Monitoring

Log Locations

ComponentFileRotation
Daemon~/.nc/logs/daemon.log10MB, 7 days
API~/.nc/logs/api.log10MB, 7 days
Sessions~/.nc/logs/sessions.log10MB, 7 days
Errors~/.nc/logs/error.log10MB, 30 days

Log Format

2026-01-16T13:45:23Z INFO [daemon] Session watcher started
2026-01-16T13:45:24Z DEBUG [api] GET /api/status 200 12ms
2026-01-16T13:45:30Z WARN [sessions] Session abc123 idle for 30m
2026-01-16T13:46:01Z ERROR [ollama] Connection refused

Log Aggregation

With journald (systemd):

journalctl -u nc-daemon -f

With Loki/Promtail:

# promtail config
scrape_configs:
- job_name: neural-commander
static_configs:
- targets:
- localhost
labels:
job: nc
__path__: /home/*/.nc/logs/*.log

Alerting

Built-in Alerts

NC has a built-in alert system that injects warnings into CLAUDE.md:

AlertTriggerSeverity
Uncommitted code>10 files modifiedWARNING
Long session>4 hours continuousINFO
High memory>80% of limitWARNING
Session crashExit code != 0CRITICAL

External Alerting

Prometheus Alert Rules:

groups:
- name: neural-commander
rules:
- alert: NCDaemonDown
expr: up{job="nc"} == 0
for: 5m
labels:
severity: critical
annotations:
summary: "NC Daemon is down"

- alert: NCHighMemory
expr: nc_memory_usage_mb > 3000
for: 10m
labels:
severity: warning
annotations:
summary: "NC using >3GB memory"

Simple Cron Check:

# /etc/cron.d/nc-monitor
*/5 * * * * user curl -sf http://localhost:7669/health || echo "NC down" | mail -s "NC Alert" admin@example.com

Admin Console Dashboard

ncmd admin

Provides real-time views:

  • Overview: System status, resource usage
  • Events: Live event stream
  • Sessions: Active Claude sessions
  • Intelligence: Pattern detection status

Keyboard Shortcuts

KeyAction
TabSwitch views
qQuit
rRefresh
?Help

Resource Monitoring

Resource Governor Status

ncmd daemon status

Shows:

Resource Governor: ACTIVE
Mode: interactive
CPU Limit: 80%
Memory Limit: 4096 MB
Current CPU: 2.3%
Current Memory: 156 MB

Per-Process Metrics

# Using top
top -p $(pgrep -f neural-commander)

# Using htop
htop -p $(pgrep -f neural-commander)

# Using ps
ps -o pid,ppid,cmd,%cpu,%mem -p $(pgrep -f neural-commander)

Integration with Monitoring Systems

Prometheus

Metrics endpoint (Pro tier):

GET /api/metrics

Returns Prometheus-format metrics:

# HELP nc_daemon_uptime_seconds Daemon uptime in seconds
# TYPE nc_daemon_uptime_seconds gauge
nc_daemon_uptime_seconds 15780

# HELP nc_sessions_active Number of active sessions
# TYPE nc_sessions_active gauge
nc_sessions_active 3

Grafana Dashboard

Import dashboard ID: NC-001 (available in Pro tier)

Panels:

  • Daemon uptime
  • Session count over time
  • API request rate
  • Cache hit ratio
  • Resource usage (CPU/Memory)

Datadog

# datadog.yaml
logs:
- type: file
path: /home/*/.nc/logs/*.log
service: neural-commander
source: go

Performance Baselines

Normal Operation

MetricExpected Range
CPU1-5% idle, 10-30% active
Memory100-300 MB
API Latency<50ms (p99)
Cache Hit Rate>70%

Warning Thresholds

MetricWarningCritical
CPU>50%>80%
Memory>2GB>3.5GB
API Latency>200ms>1000ms
Error Rate>1%>5%