
Add NAS Doctor — local NAS diagnostic and monitoring tool #4804

Open
mcdays94 wants to merge 7 commits into truenas:master from mcdays94:add-nas-doctor


@mcdays94

Description

NAS Doctor is a local diagnostic and monitoring tool for your NAS. It runs periodic health checks — analyzing SMART data, disk usage, Docker containers, ZFS pools, UPS power, tunnels, and more — then surfaces findings with actionable recommendations backed by Backblaze failure rate data.

App Information

Key Features

  • 20+ diagnostic rules with automatic root-cause correlation
  • SMART health with Backblaze failure-rate thresholds (337k+ drives)
  • Service checks: HTTP, TCP, DNS, Ping/ICMP, SMB, NFS with per-check intervals
  • Tunnel monitoring: Cloudflared and Tailscale (host + Docker detection)
  • ZFS pool health: vdev tree, scrub/resilver, ARC stats, datasets
  • UPS / power monitoring via NUT or apcupsd
  • 3 dashboard themes (Midnight, Clean, Ember)
  • Prometheus /metrics endpoint (80+ gauges)
  • Webhook alerts: Discord, Slack, Gotify, Ntfy, generic HTTP
  • Log forwarding: Loki, syslog, HTTP JSON
  • Multi-server fleet monitoring

Testing

  • Tested locally with Docker Compose
  • All test files pass (basic-values.yaml)
  • Health check configured (/api/v1/health)
  • Portal configured (web UI)
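
A reviewer could probe the health endpoint by hand with something like the sketch below. Only the `/api/v1/health` path comes from this description. The base URL (host and port), the helper names, and the reading of any 2xx status as "healthy" are assumptions for illustration.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen

HEALTH_PATH = "/api/v1/health"  # path taken from the PR description


def health_url(base_url: str) -> str:
    """Join a base URL (host and port are deployment-specific) with the health path."""
    return urljoin(base_url.rstrip("/") + "/", HEALTH_PATH.lstrip("/"))


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with a 2xx status (assumed meaning of healthy)."""
    try:
        with urlopen(health_url(base_url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error all count as unhealthy.
        return False
```

Calling `is_healthy("http://<host>:<port>")` with the address the app is published on should return `True` once the container is up.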

Icons and Screenshots

Special Notes

  • NAS Doctor requires SYS_RAWIO capability for smartctl to read SMART disk data
  • Runs as root (uid 0) for device access — same requirement as on Unraid/Synology
  • Optional host mounts for /var/log (log analysis) and /mnt (disk space monitoring)
  • Docker socket mount is optional but recommended for container monitoring
  • ZFS pool health works automatically on TrueNAS — no extra configuration needed
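
The notes above translate roughly into the following Compose service settings, sketched here as a plain Python dict. This is not the template in this PR: the function name, the toggle names, the container-side paths, and the read-only flags are all assumptions for illustration, and the image name in the usage line is a placeholder.

```python
from typing import Any


def build_service(
    image: str,
    mount_logs: bool = False,
    mount_storage: bool = False,
    mount_docker_socket: bool = False,
) -> dict[str, Any]:
    """Build a Compose-style service dict reflecting the notes above."""
    service: dict[str, Any] = {
        "image": image,
        "user": "0:0",             # runs as root for device access
        "cap_add": ["SYS_RAWIO"],  # lets smartctl read SMART data
        "volumes": [],
    }
    # Optional host mounts; target paths and ":ro" are assumptions.
    if mount_logs:
        service["volumes"].append("/var/log:/var/log:ro")
    if mount_storage:
        service["volumes"].append("/mnt:/mnt:ro")
    if mount_docker_socket:
        service["volumes"].append("/var/run/docker.sock:/var/run/docker.sock:ro")
    return service
```

For example, `build_service("example/nas-doctor:0.9.0", mount_docker_socket=True)` yields a service with the `SYS_RAWIO` capability, user `0:0`, and a single Docker socket bind mount.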

Checklist

  • app.yaml metadata is complete
  • questions.yaml has clear labels and descriptions
  • Test files pass
  • README.md is written
  • Only files under /ix-dev/ are modified

mcdays94 and others added 7 commits April 10, 2026 17:10
Breaking changes from v0.5.9:
- Update app_version and image tag to 0.9.0
- Add pid_mode: host for Top Processes feature (new in v0.9.0)
  - Enables the container to see all host processes and match them to
    Docker containers via cgroup inspection
- Add /dev mount toggle (default: true) — required for SMART on most systems
- Add /sys mount toggle (default: true) — required for GPU telemetry

Other changes:
- Default scan interval changed from 6h to 30m (matches v0.8.8+ default)
- Updated description to mention GPU, process CPU, network speed
- Added zfs and gpu keywords
- Updated README with feature list and notes
- Added host_mounts entries for /dev and /sys in app.yaml
@mcdays94
Author

Updated to NAS Doctor v0.9.0. Key changes since the initial submission:

Version bump (v0.5.9 → v0.9.0)

  • Major additions: an architecture hardening pass, Top Processes with Docker container attribution, GPU monitoring, speed test scheduling, and clickable process history charts.

Top Processes feature requires pid_mode: host
Added via c1.set_pid_mode("host") in the compose template. Without it, the container only sees its own processes (not all host processes) and cannot match them to Docker containers via cgroup inspection.
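
To show why host PID visibility matters, here is a minimal sketch of cgroup-based attribution. It is not NAS Doctor's actual code. It assumes the two common Docker layouts, `/docker/<id>` on cgroup v1 and `docker-<id>.scope` under systemd on cgroup v2, so other runtimes or cgroup drivers may need different patterns. The function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a 64-hex-char container ID in the two assumed Docker cgroup layouts.
_CONTAINER_ID = re.compile(r"(?:/docker/|docker-)([0-9a-f]{64})")


def container_id_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the Docker container ID found in /proc/<pid>/cgroup content, if any."""
    for line in cgroup_text.splitlines():
        match = _CONTAINER_ID.search(line)
        if match:
            return match.group(1)
    return None


def container_id_for_pid(pid: int, proc_root: str = "/proc") -> Optional[str]:
    """Read a process's cgroup file and extract its container ID.

    With pid_mode: host the container's /proc lists host processes, so this
    works for any host PID; without it only the container's own PIDs exist.
    """
    try:
        text = Path(proc_root, str(pid), "cgroup").read_text()
    except OSError:
        return None  # process exited or is not visible
    return container_id_from_cgroup(text)
```

A host process that is not in a container (for example one whose cgroup line is `0::/init.scope`) returns `None`, which a caller could display as a plain host process.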

New mount toggles (default: true)

  • mount_dev — mounts /dev read-only, required for SMART on most systems
  • mount_sys — mounts /sys read-only, required for GPU telemetry (NVIDIA, Intel, AMD)

Other updates

  • Default scan interval: 6h → 30m (matches v0.8.8+ upstream default)
  • Updated description, keywords, README to reflect new capabilities

Full changelog: https://github.com/mcdays94/nas-doctor/releases/tag/v0.9.0



2 participants