diff --git a/README.md b/README.md
index 0da2e25..6f19313 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,10 @@ Fan control and keyboard lighting for HP Victus / Omen laptops on Linux. Stock f
 - **GNOME Shell extension**: supported on GNOME Shell 45+ and auto-installed by `install.sh` when GNOME is present.
 - **Ubuntu GNOME**: the extension should work on GNOME-based Ubuntu setups if `victus-backend.service` is already installed and reachable, but the project does not currently ship a full Ubuntu installer.
 
+## Ubuntu / Secure Boot (userspace, no kernel module)
+
+If you can't load the patched `hp-wmi` DKMS module — most notably on **Ubuntu with Secure Boot enabled**, where unsigned out-of-tree modules are rejected — see [`victus-fan/`](victus-fan/). It is a self-contained **userspace** controller that drives the **stock** in-tree `hp-wmi` driver (two-state `pwm1_enable`, no kernel module), tying fan speed to your power profile and CPU / iGPU / NVIDIA temperature via a small daemon + CLI (no GUI). Tested on an HP Victus 15-fb0xxx running Ubuntu 26.04 — see [`victus-fan/README.md`](victus-fan/README.md).
+
 ## System Requirements
 - 64-bit Linux with `systemd`.
 - Supported installer targets:
diff --git a/victus-fan/README.md b/victus-fan/README.md
new file mode 100644
index 0000000..af9c5cb
--- /dev/null
+++ b/victus-fan/README.md
@@ -0,0 +1,315 @@
+# victus-fan
+
+A small **userspace** fan controller for the **HP Victus 15-fb0xxx**
+(AMD Ryzen 5 5600H iGPU + NVIDIA RTX 3050 dGPU) running **Ubuntu 26.04**
+(kernel 7.0) with **Secure Boot ON**.
+
+> **No kernel module.** victus-fan uses *only* the stock, in-tree `hp-wmi`
+> driver. Nothing needs to be compiled, signed, or DKMS-rebuilt, so it keeps
+> working under Secure Boot and across kernel upgrades.
+
+> **No GUI.** victus-fan is a background daemon plus a small command-line tool
+> (`victus-fanctl`). Fan behaviour follows your power mode automatically; there
+> is nothing to open.
+
+## ✅ Tested & verified
+
+Built and tested **end-to-end on real hardware** — an **HP Victus 15-fb0xxx**
+(hostname `jayesh-Victus`) running **Ubuntu 26.04 LTS** (kernel
+`7.0.0-22-generic`, Secure Boot **ON**). Every power mode, the heat-triggered
+MAX override, the manual overrides, the permission model, and the
+restore-fans-to-firmware-on-stop safety path were all verified live, and it is
+installed as a systemd service in daily use on that machine.
+
+Full report: **[docs/TESTING.md](docs/TESTING.md)**.
+
+---
+
+## What it does
+
+A root daemon (`victus-fand`) watches your CPU, integrated-GPU and (when awake)
+discrete-GPU temperatures, your power profile and your battery state, and flips
+the laptop fans between two states using the hp-wmi `pwm1_enable` knob:
+
+* **MAX** — full speed (`pwm1_enable = 0`)
+* **AUTO** — HP firmware fan curve (`pwm1_enable = 2`)
+
+It writes a world-readable status file (`/run/victus-fan/status.json`) that the
+`victus-fanctl` CLI reads to show live state.
+
+### The hardware reality (why it is two-state)
+
+On this machine the hp-wmi `pwm1_enable` is the **only** fan knob the firmware
+exposes to Linux, and it accepts just two usable values:
+
+| Value | Meaning                         |
+|-------|---------------------------------|
+| `0`   | **MAX** — fans at full speed    |
+| `2`   | **AUTO** — HP firmware curve    |
+| `1`   | *rejected by the driver*        |
+
+There is **no PWM duty file** and **no fan target file** — you cannot set an
+arbitrary fan percentage or a custom RPM. That means:
+
+* **"performance"** maps to **continuous MAX** — exact and under our control.
+* **Heat-triggered MAX** (the hysteresis below) is exact and under our control.
+* **"quiet" / "cool" / "off"** are *delegated to the HP firmware's own AUTO
+  curve* — we simply leave the fans in AUTO and let the firmware be quiet. We
+  cannot make them quieter than HP's firmware already does, because there is no
+  finer knob to turn.
+
+So victus-fan's job is precise where it can be (force MAX on heat or on
+performance) and otherwise gets out of the way (hand back to firmware AUTO).
+
+### Sensors used
+
+* **CPU** — `hwmon` named `k10temp` → `temp1_input` (Tctl). Fallbacks:
+  `coretemp`, `zenpower`.
+* **iGPU (Radeon)** — `hwmon` named `amdgpu` → `temp1_input` (edge).
+* **dGPU (NVIDIA)** — NVIDIA exposes **no hwmon**, so temperature is read with
+  `nvidia-smi`. See the no-wake note below.
+* **Battery / AC** — `/sys/class/power_supply/*`.
+* **Power profile** — `/sys/firmware/acpi/platform_profile` (read only; owned by
+  `power-profiles-daemon`). victus-fan **never writes** the platform profile —
+  it only ever writes `pwm1_enable`.
+
+### NVIDIA no-wake note
+
+The discrete GPU spends most of its life runtime-suspended to save battery.
+victus-fan **only** calls `nvidia-smi` when the NVIDIA PCI device is already
+`active` (its `power/runtime_status`), so polling the fan controller never wakes
+the dGPU and never drains your battery. When the dGPU is suspended, the status
+reports `dgpu: null` and `dgpu_state: "suspended"`, and it is simply excluded
+from the fan decision. `nvidia-smi` calls are also guarded by a 2-second timeout
+and tolerate the tool being absent.
+
+---
+
+## Per-power-mode behaviour
+
+The active power profile (from `platform_profile`) is mapped to an internal
+policy via `profile_map`, then that policy decides the fans:
+
+| Power profile (`platform_profile`) | Internal policy | Fan behaviour                                   |
+|------------------------------------|-----------------|-------------------------------------------------|
+| `performance`                      | `performance`   | **Always MAX** (full speed)                     |
+| `balanced`                         | `balanced`      | **AUTO**, forced to **MAX** on heat (hysteresis)|
+| `quiet`                            | `power-saver`   | **AUTO**, forced to MAX only at high temps      |
+| `cool`                             | `power-saver`   | **AUTO**, forced to MAX only at high temps      |
+
+> Note: GNOME's **power-saver** mode sets `platform_profile` to `quiet`, which
+> maps to the `power-saver` policy above.
+
+Two extra rules sit on top:
+
+* **Low battery** — when on battery **and** capacity ≤ `low_battery_pct`
+  (default 20%), the policy is pushed to `power-saver` regardless of the power
+  profile, so a hot moment on a near-empty battery still spins the fans, but the
+  quiet thresholds apply.
+* **Manual override** — you can force `max` or `auto` with `victus-fanctl`. The
+  daemon **resets the override to `policy` on startup**, so a forced state never
+  silently persists across a reboot.
+
+### Hysteresis (heat-triggered MAX)
+
+For every *present* sensor (CPU always; iGPU if available; dGPU only when
+awake), the active policy defines an **on** and **off** threshold:
+
+* If **any** present sensor is **≥ its `_on`** threshold → **MAX**.
+* Else, if currently MAX **and every** present sensor is **≤ its `_off`**
+  threshold → drop back to **AUTO**.
+* Otherwise keep the current state (this gap between on/off prevents flapping).
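Reviewer note: the three hysteresis rules above can be traced by hand. The sketch below is a simplified, single-sensor illustration and not the daemon's code; `next_state` and `trace` are hypothetical helper names, and the 85 / 75 thresholds are the `balanced` CPU defaults from this README.

```python
PWM_MAX, PWM_AUTO = "0", "2"  # pwm1_enable values documented in the README


def next_state(prev, temp_c, on_thr, off_thr):
    """One hysteresis step for a single sensor (hypothetical helper)."""
    if temp_c >= on_thr:
        return PWM_MAX                      # rule 1: at or above ON -> MAX
    if prev == PWM_MAX and temp_c <= off_thr:
        return PWM_AUTO                     # rule 2: cooled to OFF -> AUTO
    return prev                             # rule 3: inside the band, no change


def trace(temps, on_thr=85, off_thr=75, start=PWM_AUTO):
    """Feed a sequence of readings through next_state and label each result."""
    state, out = start, []
    for t in temps:
        state = next_state(state, t, on_thr, off_thr)
        out.append("MAX" if state == PWM_MAX else "AUTO")
    return out


print(trace([70, 86, 80, 76, 75, 84]))
# → ['AUTO', 'MAX', 'MAX', 'MAX', 'AUTO', 'AUTO']
```

The readings 80 and 76 stay at MAX because they are still above the OFF threshold, while 84 on the way back up stays at AUTO because it has not reached the ON threshold. That 10 °C band is what prevents flapping.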
+
+Default thresholds (°C):
+
+| Policy        | CPU on/off | iGPU on/off | dGPU on/off |
+|---------------|-----------:|------------:|------------:|
+| `balanced`    |    85 / 75 |     70 / 60 |     72 / 62 |
+| `power-saver` |    90 / 80 |     78 / 68 |     80 / 70 |
+| `performance` | always MAX |  always MAX |  always MAX |
+
+---
+
+## Project layout
+
+```
+victus-fan/                     # this folder, inside the victus-control repo
+├── daemon/
+│   ├── victus-fand             # root systemd daemon — the policy engine
+│   └── victus-fanctl           # command-line control + status tool
+├── packaging/
+│   ├── victus-fan.service      # systemd unit (runs the daemon as root)
+│   ├── victus-fan.tmpfiles     # creates /run/victus-fan
+│   └── config.default.json     # default thresholds / settings
+├── docs/
+│   └── TESTING.md              # hardware verification report
+├── install.sh                  # Ubuntu installer (re-execs itself with sudo)
+├── uninstall.sh                # restores firmware fan control + removes files
+└── README.md
+```
+
+> **License:** `victus-fan` is part of [victus-control](../) and is distributed
+> under the **GNU GPL v3.0 or later**, the same license as the parent repository.
+
+## Install
+
+From the `victus-fan/` directory (inside a clone of the victus-control repo):
+
+```bash
+sudo ./install.sh
+```
+
+The installer (re-execs itself with `sudo` if needed) will:
+
+1. Ensure `python3` is present (the daemon and CLI are pure Python stdlib —
+   there are no other dependencies).
+2. Create the system group **`victusfan`** and add **you** (the `SUDO_USER`)
+   to it.
+3. Install the daemon and CLI to their system paths (see below).
+4. Create `/etc/victus-fan/` (`root:victusfan`, mode `2775`) and install the
+   default `config.json` **only if one does not already exist**.
+5. Install + enable the systemd service and the tmpfiles runtime dir, then
+   start the daemon.
+
+> **Log out and back in** (or reboot) afterwards so that your **`victusfan`
+> group** membership takes effect. That is only needed to edit the config with
+> `victus-fanctl` as your normal user — running it with `sudo` works immediately,
+> and the daemon itself works the moment it is installed.
+
+### Install targets
+
+| Component     | Path                                     | Mode / owner            |
+|---------------|------------------------------------------|-------------------------|
+| daemon        | `/usr/lib/victus-fan/victus-fand`        | `0755` root             |
+| CLI           | `/usr/bin/victus-fanctl`                 | `0755`                  |
+| service       | `/etc/systemd/system/victus-fan.service` | runs as **root**        |
+| tmpfiles      | `/etc/tmpfiles.d/victus-fan.conf`        | `0644`                  |
+| config dir    | `/etc/victus-fan/`                       | `root:victusfan` `2775` |
+| config file   | `/etc/victus-fan/config.json`            | `root:victusfan` `0664` |
+| runtime dir   | `/run/victus-fan/`                       | `0755 root root`        |
+| status file   | `/run/victus-fan/status.json`            | `0644` (world-readable) |
+
+### Privilege model
+
+* The **daemon runs as root** because it must write sysfs (`pwm1_enable`) and
+  `/run/victus-fan`.
+* **`victus-fanctl` runs as you.** It only **reads** `/run/victus-fan/status.json`
+  (world-readable) and **writes** `/etc/victus-fan/config.json`. That config
+  directory is group `victusfan` with the **setgid** bit so any member can
+  atomically replace the file (write a temp file in the same dir, then `rename`).
+  This is why you must be in the `victusfan` group (and have re-logged in) to
+  change settings without `sudo`; otherwise a config write fails with a clear
+  "you must be in the 'victusfan' group and re-login" message.
+
+---
+
+## Uninstall
+
+```bash
+sudo ./uninstall.sh
+```
+
+This stops and disables the service, writes `2` (AUTO) to the hp-wmi
+`pwm1_enable` to hand the fans back to firmware, and removes the installed
+program files. It **leaves** `/etc/victus-fan/` and the `victusfan` group in
+place; to remove those too:
+
+```bash
+sudo rm -rf /etc/victus-fan
+sudo groupdel victusfan
+```
+
+---
+
+## Configuration
+
+Edit `/etc/victus-fan/config.json` (you must be in the `victusfan` group, or use
+`sudo`). The daemon merges your file over its built-in defaults and **re-reads it
+every poll loop**, so changes take effect within `poll_interval_sec` — no restart
+needed.
+
+```json
+{
+  "enabled": true,
+  "poll_interval_sec": 2,
+  "low_battery_pct": 20,
+  "override": "policy",
+  "profile_map": {
+    "performance": "performance",
+    "balanced": "balanced",
+    "quiet": "power-saver",
+    "cool": "power-saver",
+    "low-power": "power-saver"
+  },
+  "profiles": {
+    "performance": { "always_max": true },
+    "balanced": { "cpu_on": 85, "cpu_off": 75, "igpu_on": 70, "igpu_off": 60, "dgpu_on": 72, "dgpu_off": 62 },
+    "power-saver": { "cpu_on": 90, "cpu_off": 80, "igpu_on": 78, "igpu_off": 68, "dgpu_on": 80, "dgpu_off": 70 }
+  }
+}
+```
+
+| Key                 | Meaning                                                                 |
+|---------------------|-------------------------------------------------------------------------|
+| `enabled`           | Master switch. `false` → leave fans in AUTO and do nothing.             |
+| `poll_interval_sec` | Seconds between evaluations.                                            |
+| `low_battery_pct`   | At/below this % on battery, force the `power-saver` policy.             |
+| `override`          | `policy` (normal), `max` (force full speed), `auto` (force firmware).   |
+| `profile_map`       | Maps `platform_profile` values to a policy name.                        |
+| `profiles.*`        | `performance` is `always_max`; the others give per-sensor on/off °C.    |
+
+> `override` is reset to `policy` every time the daemon starts, so a forced
+> state cannot survive a reboot.
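Reviewer note: "merges your file over its built-in defaults" means a config file only needs the keys that differ. The sketch below illustrates that with a recursive merge of the same shape as the `deep_merge` helper in this patch. The `defaults` and `user_file` dictionaries are trimmed, illustrative examples, not the full shipped config.

```python
import json


def deep_merge(base, override):
    """Return a copy of base with override applied, recursing into nested dicts."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(out.get(key), dict) and isinstance(value, dict):
            out[key] = deep_merge(out[key], value)
        else:
            out[key] = value
    return out


# Trimmed stand-in for the built-in defaults (illustrative subset only).
defaults = {
    "poll_interval_sec": 2,
    "profiles": {"balanced": {"cpu_on": 85, "cpu_off": 75}},
}
# A user file that changes a single threshold.
user_file = {"profiles": {"balanced": {"cpu_on": 80}}}

effective = deep_merge(defaults, user_file)
print(json.dumps(effective))
# → {"poll_interval_sec": 2, "profiles": {"balanced": {"cpu_on": 80, "cpu_off": 75}}}
```

Only `cpu_on` changes. `cpu_off` and `poll_interval_sec` fall through from the defaults, and the `defaults` dictionary itself is left unmodified.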
+
+---
+
+## `victus-fanctl` usage
+
+```bash
+victus-fanctl status                  # pretty-print the current status.json
+victus-fanctl get                     # print the effective (merged) config
+victus-fanctl override max            # force fans to MAX
+victus-fanctl override auto           # force firmware AUTO
+victus-fanctl override policy         # return to automatic policy
+victus-fanctl set balanced cpu_on 80  # tune a single threshold
+victus-fanctl enable                  # master switch on
+victus-fanctl disable                 # master switch off (leaves fans in AUTO)
+```
+
+`override` / `set` / `enable` / `disable` work by atomically rewriting
+`/etc/victus-fan/config.json`, so they require `victusfan` group membership (or
+`sudo`). If you see a permission error, you are not yet in the group (or have not
+re-logged in since installing).
+
+---
+
+## Troubleshooting
+
+* **"PermissionError" / cannot save config from `victus-fanctl`** — you are not
+  in the `victusfan` group yet, or you have not logged out and back in since
+  installing. Check with `id` (look for `victusfan`), or just prefix the command
+  with `sudo`.
+
+* **Service / daemon logs**:
+
+  ```bash
+  systemctl status victus-fan.service
+  journalctl -u victus-fan -f
+  ```
+
+* **Fans seem stuck** — check the live decision:
+
+  ```bash
+  victus-fanctl status   # shows fan.state, override, decision_reason
+  ```
+
+  Remember `performance` is *always* MAX by design. To hand control fully back
+  to firmware, run `victus-fanctl override auto` (temporary) or set
+  `enabled: false`.
+
+* **dGPU temp shows `null`** — that is expected when the NVIDIA GPU is
+  runtime-suspended; victus-fan deliberately does not wake it. The temperature
+  appears once something is using the dGPU.
+
+* **Reset to firmware quickly** — `victus-fanctl override auto`, or stop the
+  service: `sudo systemctl stop victus-fan` (the daemon writes AUTO on exit).
diff --git a/victus-fan/daemon/victus-fanctl b/victus-fan/daemon/victus-fanctl
new file mode 100755
index 0000000..8e30178
--- /dev/null
+++ b/victus-fan/daemon/victus-fanctl
@@ -0,0 +1,336 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+# SPDX-License-Identifier: GPL-3.0-or-later
+# Copyright (C) 2026 Jayesh Barman
+"""
+victus-fanctl — command-line control for victus-fand.
+
+Runs as the user. It only READS /run/victus-fan/status.json (world-readable)
+and WRITES /etc/victus-fan/config.json atomically (group victusfan, setgid dir).
+
+Subcommands:
+  status                       pretty-print the live daemon status
+  get                          print the merged effective config (JSON)
+  override max|auto|policy     force fans MAX/AUTO, or hand back to policy
+  set <profile> <key> <value>  set an integer threshold in profiles.<profile>
+  enable                       set enabled=true
+  disable                      set enabled=false
+
+Pure python3 stdlib only.
+"""
+
+import json
+import os
+import sys
+
+CONFIG_PATH = "/etc/victus-fan/config.json"
+STATUS_PATH = "/run/victus-fan/status.json"
+
+# Mirror of the daemon defaults so `get` and `set` behave even before the
+# daemon has ever written a config (kept intentionally in sync with the daemon).
+DEFAULT_CONFIG = {
+    "enabled": True,
+    "poll_interval_sec": 2,
+    "low_battery_pct": 20,
+    "override": "policy",
+    "profile_map": {
+        "performance": "performance",
+        "balanced": "balanced",
+        "quiet": "power-saver",
+        "cool": "power-saver",
+        "low-power": "power-saver",
+    },
+    "profiles": {
+        "performance": {"always_max": True},
+        "balanced": {
+            "cpu_on": 85, "cpu_off": 75,
+            "igpu_on": 70, "igpu_off": 60,
+            "dgpu_on": 72, "dgpu_off": 62,
+        },
+        "power-saver": {
+            "cpu_on": 90, "cpu_off": 80,
+            "igpu_on": 78, "igpu_off": 68,
+            "dgpu_on": 80, "dgpu_off": 70,
+        },
+    },
+}
+
+VALID_OVERRIDES = ("max", "auto", "policy")
+
+PERM_HELP = (
+    "Permission denied writing %s.\n"
+    "You must be a member of the 'victusfan' group, then log out and back in\n"
+    "(or run: newgrp victusfan). Alternatively run this command via pkexec/sudo."
+) % CONFIG_PATH
+
+
+# ----------------------------------------------------------------------------
+# Helpers
+# ----------------------------------------------------------------------------
+def err(msg):
+    sys.stderr.write(msg.rstrip("\n") + "\n")
+
+
+def deep_merge(base, override):
+    out = dict(base)
+    for k, v in override.items():
+        if k in out and isinstance(out[k], dict) and isinstance(v, dict):
+            out[k] = deep_merge(out[k], v)
+        else:
+            out[k] = v
+    return out
+
+
+def load_raw_config():
+    """
+    Load the on-disk config dict (NOT merged with defaults), so we can mutate
+    and write it back preserving exactly what the user has. Returns {} if the
+    file does not exist; raises on JSON errors.
+    """
+    try:
+        with open(CONFIG_PATH, "r") as f:
+            data = json.load(f)
+    except FileNotFoundError:
+        return {}
+    if not isinstance(data, dict):
+        raise ValueError("config root is not a JSON object")
+    return data
+
+
+def load_effective_config():
+    """Return on-disk config merged over defaults (what the daemon would use)."""
+    try:
+        raw = load_raw_config()
+    except Exception as exc:  # noqa: BLE001
+        err("warning: config unreadable (%s); showing defaults" % exc)
+        raw = {}
+    return deep_merge(DEFAULT_CONFIG, raw)
+
+
+def atomic_write_config(data):
+    """
+    Atomically write `data` (dict) to CONFIG_PATH by creating a temp file in the
+    SAME directory and os.replace()-ing it onto config.json. Surfaces a clear
+    message on PermissionError.
+    """
+    directory = os.path.dirname(CONFIG_PATH) or "."
+    payload = json.dumps(data, indent=2) + "\n"
+    tmp = os.path.join(directory, ".config.json.tmp.%d" % os.getpid())
+    try:
+        # Honor the setgid dir + umask; aim for 0664 so the group can rewrite.
+        fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o664)
+        with os.fdopen(fd, "w") as f:
+            f.write(payload)
+            f.flush()
+            os.fsync(f.fileno())
+        # Defeat the process umask (which would knock 0664 down to 0644) so the
+        # file stays group-writable for other 'victusfan' members. chmod ignores
+        # umask; the writer always owns this freshly-created temp so it succeeds.
+        try:
+            os.chmod(tmp, 0o664)
+        except OSError:
+            pass
+        os.replace(tmp, CONFIG_PATH)
+    except PermissionError:
+        _cleanup(tmp)
+        err(PERM_HELP)
+        sys.exit(13)
+    except OSError as exc:
+        _cleanup(tmp)
+        err("error writing config: %s" % exc)
+        sys.exit(1)
+
+
+def _cleanup(path):
+    try:
+        os.unlink(path)
+    except OSError:
+        pass
+
+
+def mutate_config(mutator):
+    """Load raw config, apply mutator(dict)->dict, write atomically."""
+    try:
+        raw = load_raw_config()
+    except Exception as exc:  # noqa: BLE001
+        err("error: cannot parse existing config (%s)" % exc)
+        sys.exit(1)
+    new = mutator(raw)
+    atomic_write_config(new)
+
+
+# ----------------------------------------------------------------------------
+# Subcommands
+# ----------------------------------------------------------------------------
+def cmd_status(_args):
+    try:
+        with open(STATUS_PATH, "r") as f:
+            st = json.load(f)
+    except FileNotFoundError:
+        err("no status yet at %s — is victus-fand running? (systemctl status victus-fan)"
+            % STATUS_PATH)
+        return 1
+    except Exception as exc:  # noqa: BLE001
+        err("error reading status: %s" % exc)
+        return 1
+
+    temps = st.get("temps", {}) or {}
+    fan = st.get("fan", {}) or {}
+
+    def fmt_temp(v):
+        return "n/a" if v is None else ("%.1f C" % v)
+
+    def fmt_rpm(v):
+        return "n/a" if v is None else ("%d rpm" % v)
+
+    lines = []
+    lines.append("victus-fan status")
+    lines.append(" enabled : %s" % st.get("enabled"))
+    lines.append(" override : %s" % st.get("override"))
+    lines.append(" power profile : %s" % st.get("power_profile_raw"))
+    lines.append(" policy : %s" % st.get("policy"))
+    lines.append(" effective : %s" % st.get("effective_policy"))
+    batt = st.get("battery_pct")
+    batt_s = "n/a" if batt is None else ("%d%%" % batt)
+    src = "battery" if st.get("on_battery") else "AC"
+    low = " (LOW)" if st.get("low_battery_active") else ""
+    lines.append(" power source : %s battery %s%s" % (src, batt_s, low))
+    lines.append(" temps : cpu %s | igpu %s | dgpu %s [%s]" % (
+        fmt_temp(temps.get("cpu")),
+        fmt_temp(temps.get("igpu")),
+        fmt_temp(temps.get("dgpu")),
+        temps.get("dgpu_state"),
+    ))
+    lines.append(" fan : %s (fan1 %s, fan2 %s, pwm_enable=%s)" % (
+        fan.get("state"),
+        fmt_rpm(fan.get("rpm1")),
+        fmt_rpm(fan.get("rpm2")),
+        fan.get("pwm_enable"),
+    ))
+    lines.append(" reason : %s" % st.get("decision_reason"))
+    ts = st.get("ts")
+    if isinstance(ts, (int, float)):
+        age = max(0.0, _now() - ts)
+        lines.append(" updated : %.1fs ago" % age)
+    print("\n".join(lines))
+    return 0
+
+
+def _now():
+    import time
+    return time.time()
+
+
+def cmd_get(_args):
+    print(json.dumps(load_effective_config(), indent=2))
+    return 0
+
+
+def cmd_override(args):
+    if len(args) != 1 or args[0] not in VALID_OVERRIDES:
+        err("usage: victus-fanctl override max|auto|policy")
+        return 2
+    value = args[0]
+
+    def m(cfg):
+        cfg["override"] = value
+        return cfg
+
+    mutate_config(m)
+    print("override set to '%s'" % value)
+    return 0
+
+
+def cmd_set(args):
+    if len(args) != 3:
+        err("usage: victus-fanctl set <profile> <key> <value>")
+        return 2
+    profile, key, raw_val = args
+    try:
+        value = int(raw_val)
+    except ValueError:
+        err("error: value must be an integer, got %r" % raw_val)
+        return 2
+
+    def m(cfg):
+        profiles = cfg.get("profiles")
+        if not isinstance(profiles, dict):
+            profiles = {}
+            cfg["profiles"] = profiles
+        prof = profiles.get(profile)
+        if not isinstance(prof, dict):
+            prof = {}
+            profiles[profile] = prof
+        prof[key] = value
+        return cfg
+
+    mutate_config(m)
+    print("set profiles.%s.%s = %d" % (profile, key, value))
+    return 0
+
+
+def cmd_enable(_args):
+    def m(cfg):
+        cfg["enabled"] = True
+        return cfg
+
+    mutate_config(m)
+    print("enabled")
+    return 0
+
+
+def cmd_disable(_args):
+    def m(cfg):
+        cfg["enabled"] = False
+        return cfg
+
+    mutate_config(m)
+    print("disabled")
+    return 0
+
+
+USAGE = """\
+victus-fanctl — control victus-fand
+
+usage:
+  victus-fanctl status
+  victus-fanctl get
+  victus-fanctl override max|auto|policy
+  victus-fanctl set <profile> <key> <value>
+  victus-fanctl enable
+  victus-fanctl disable
+"""
+
+
+def main(argv):
+    if not argv:
+        sys.stderr.write(USAGE)
+        return 2
+
+    cmd = argv[0]
+    rest = argv[1:]
+
+    handlers = {
+        "status": cmd_status,
+        "get": cmd_get,
+        "override": cmd_override,
+        "set": cmd_set,
+        "enable": cmd_enable,
+        "disable": cmd_disable,
+    }
+
+    if cmd in ("-h", "--help", "help"):
+        sys.stdout.write(USAGE)
+        return 0
+
+    handler = handlers.get(cmd)
+    if handler is None:
+        err("unknown command: %s" % cmd)
+        sys.stderr.write(USAGE)
+        return 2
+
+    return handler(rest)
+
+
+if __name__ == "__main__":
+    sys.exit(main(sys.argv[1:]))
diff --git a/victus-fan/daemon/victus-fand b/victus-fan/daemon/victus-fand
new file mode 100755
index 0000000..3d3935a
--- /dev/null
+++ b/victus-fan/daemon/victus-fand
@@ -0,0 +1,729 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+# SPDX-License-Identifier: GPL-3.0-or-later
+# Copyright (C) 2026 Jayesh Barman
+"""
+victus-fand — userspace fan controller daemon for HP Victus 15-fb0xxx.
+
+SAFETY-CRITICAL. Runs as root. Controls the laptop fans by writing ONLY the
+stock hp-wmi driver's pwm1_enable file (two-state: "0"=MAX, "2"=AUTO). It never
+writes platform_profile and never touches any other sysfs control.
+
+Design goals:
+  * Never die on a transient error: the whole loop body is wrapped in try/except.
+  * Never wake a suspended NVIDIA dGPU: nvidia-smi is only invoked when the PCI
+    device's runtime power status is "active".
+  * Always hand the fans back to firmware (pwm1_enable=2 / AUTO) on exit.
+  * Re-discover sysfs paths if a cached path disappears (hwmon numbering can move).
+
+Pure python3 stdlib only.
+"""
+
+import errno
+import json
+import os
+import signal
+import subprocess
+import sys
+import time
+
+DAEMON_VERSION = "1.0"
+
+CONFIG_PATH = "/etc/victus-fan/config.json"
+STATUS_PATH = "/run/victus-fan/status.json"
+
+HP_WMI_HWMON_GLOB_ROOT = "/sys/devices/platform/hp-wmi/hwmon"
+HWMON_CLASS_ROOT = "/sys/class/hwmon"
+POWER_SUPPLY_ROOT = "/sys/class/power_supply"
+PLATFORM_PROFILE_PATH = "/sys/firmware/acpi/platform_profile"
+PCI_DEVICES_ROOT = "/sys/bus/pci/devices"
+
+# pwm1_enable semantics for this hardware (verified live):
+#   "0" -> MAX  (full speed)
+#   "2" -> AUTO (firmware curve)
+# Writing "1" is rejected by the driver; we never use it.
+PWM_MAX = "0"
+PWM_AUTO = "2"
+
+NVIDIA_VENDOR = "0x10de"
+NVIDIA_SMI_TIMEOUT_SEC = 2
+
+# ----------------------------------------------------------------------------
+# Default configuration (single source of truth mirrored in config.default.json)
+# ----------------------------------------------------------------------------
+DEFAULT_CONFIG = {
+    "enabled": True,
+    "poll_interval_sec": 2,
+    "low_battery_pct": 20,
+    "override": "policy",  # "policy" | "max" | "auto"
+    "profile_map": {
+        "performance": "performance",
+        "balanced": "balanced",
+        "quiet": "power-saver",
+        "cool": "power-saver",
+        "low-power": "power-saver",
+    },
+    "profiles": {
+        "performance": {"always_max": True},
+        "balanced": {
+            "cpu_on": 85, "cpu_off": 75,
+            "igpu_on": 70, "igpu_off": 60,
+            "dgpu_on": 72, "dgpu_off": 62,
+        },
+        "power-saver": {
+            "cpu_on": 90, "cpu_off": 80,
+            "igpu_on": 78, "igpu_off": 68,
+            "dgpu_on": 80, "dgpu_off": 70,
+        },
+    },
+}
+
+
+# ----------------------------------------------------------------------------
+# Small helpers
+# ----------------------------------------------------------------------------
+def log(msg):
+    """Log to stderr (captured by systemd journal)."""
+    sys.stderr.write("victus-fand: " + msg + "\n")
+    sys.stderr.flush()
+
+
+def read_text(path):
+    """Read a sysfs/text file, returning stripped contents or None on error."""
+    try:
+        with open(path, "r") as f:
+            return f.read().strip()
+    except (OSError, IOError):
+        return None
+
+
+def read_int(path):
+    """Read an integer from a sysfs file, returning None on error."""
+    val = read_text(path)
+    if val is None:
+        return None
+    try:
+        return int(val)
+    except ValueError:
+        # Some sysfs files may have trailing junk; try the first token.
+        try:
+            return int(val.split()[0])
+        except (ValueError, IndexError):
+            return None
+
+
+def listdir(path):
+    try:
+        return os.listdir(path)
+    except OSError:
+        return []
+
+
+def atomic_write(path, data, mode=0o644):
+    """
+    Atomically write `data` (str) to `path` by writing a temp file in the SAME
+    directory and os.replace()-ing it into place.
+    """
+    directory = os.path.dirname(path) or "."
+    tmp = os.path.join(directory, ".%s.tmp.%d" % (os.path.basename(path), os.getpid()))
+    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
+    try:
+        with os.fdopen(fd, "w") as f:
+            f.write(data)
+            f.flush()
+            os.fsync(f.fileno())
+        os.replace(tmp, path)
+        try:
+            os.chmod(path, mode)
+        except OSError:
+            pass
+    except Exception:
+        try:
+            os.unlink(tmp)
+        except OSError:
+            pass
+        raise
+
+
+# ----------------------------------------------------------------------------
+# Discovery — all paths discovered dynamically, cached, and re-validated.
+# ----------------------------------------------------------------------------
+def find_hp_wmi_hwmon():
+    """
+    Find the hp-wmi hwmon directory that exposes pwm1_enable + fan1_input.
+    Returns the hwmon directory path, or None.
+    """
+    base = HP_WMI_HWMON_GLOB_ROOT
+    for name in sorted(listdir(base)):
+        if not name.startswith("hwmon"):
+            continue
+        d = os.path.join(base, name)
+        if os.path.exists(os.path.join(d, "pwm1_enable")):
+            return d
+    return None
+
+
+def find_hwmon_by_name(names):
+    """
+    Scan /sys/class/hwmon/* for a device whose `name` file matches one of
+    `names` (in priority order). Returns the hwmon dir path, or None.
+    `names` is a list to allow fallbacks (e.g. k10temp -> coretemp -> zenpower).
+    """
+    # Build a map name->dir first so we can honor caller priority order.
+    found = {}
+    for entry in sorted(listdir(HWMON_CLASS_ROOT)):
+        d = os.path.join(HWMON_CLASS_ROOT, entry)
+        nm = read_text(os.path.join(d, "name"))
+        if nm is not None and nm not in found:
+            found[nm] = d
+    for want in names:
+        if want in found:
+            return found[want]
+    return None
+
+
+def find_nvidia_pci_device():
+    """
+    Locate the NVIDIA dGPU PCI device (vendor 0x10de, class starts 0x0300).
+    Returns the sysfs device dir, or None.
+    """
+    for entry in sorted(listdir(PCI_DEVICES_ROOT)):
+        d = os.path.join(PCI_DEVICES_ROOT, entry)
+        vendor = read_text(os.path.join(d, "vendor"))
+        klass = read_text(os.path.join(d, "class"))
+        if vendor is None or klass is None:
+            continue
+        if vendor.lower() == NVIDIA_VENDOR and klass.lower().startswith("0x0300"):
+            return d
+    return None
+
+
+# ----------------------------------------------------------------------------
+# Sensor / fan / battery / profile readers
+# ----------------------------------------------------------------------------
+def milli_to_c(milli):
+    if milli is None:
+        return None
+    return round(milli / 1000.0, 1)
+
+
+def read_cpu_temp(cpu_hwmon):
+    if not cpu_hwmon:
+        return None
+    return milli_to_c(read_int(os.path.join(cpu_hwmon, "temp1_input")))
+
+
+def read_igpu_temp(igpu_hwmon):
+    if not igpu_hwmon:
+        return None
+    return milli_to_c(read_int(os.path.join(igpu_hwmon, "temp1_input")))
+
+
+def nvidia_runtime_status(nvidia_dev):
+    """
+    Return the runtime power status string for the NVIDIA device
+    (e.g. "active", "suspended", "suspending"), or None if unknown.
+    """
+    if not nvidia_dev:
+        return None
+    status = read_text(os.path.join(nvidia_dev, "power", "runtime_status"))
+    return status
+
+
+def read_dgpu(nvidia_dev):
+    """
+    Read dGPU temperature WITHOUT waking it.
+
+    Returns (temp_c_or_None, state_str). state_str is one of:
+      "active"    -> GPU awake; temp populated if nvidia-smi succeeded
+      "suspended" -> GPU asleep; we deliberately did NOT call nvidia-smi
+      "unknown"   -> could not determine (no device / no runtime status)
+    """
+    status = nvidia_runtime_status(nvidia_dev)
+
+    if status is None:
+        # No NVIDIA device or no runtime power info — treat as unknown/absent.
+        return None, "unknown"
+
+    if status != "active":
+        # Suspended/suspending/etc. Do NOT call nvidia-smi (would wake it).
+        return None, "suspended"
+
+    # Device is active: safe to query nvidia-smi (guarded for missing/timeout).
+    try:
+        out = subprocess.run(
+            [
+                "nvidia-smi",
+                "--query-gpu=temperature.gpu,utilization.gpu",
+                "--format=csv,noheader,nounits",
+            ],
+            capture_output=True,
+            text=True,
+            timeout=NVIDIA_SMI_TIMEOUT_SEC,
+        )
+    except FileNotFoundError:
+        # nvidia-smi not installed.
+        return None, "active"
+    except subprocess.TimeoutExpired:
+        return None, "active"
+    except Exception as exc:  # noqa: BLE001 - never let this crash the loop
+        log("nvidia-smi error: %s" % exc)
+        return None, "active"
+
+    if out.returncode != 0:
+        return None, "active"
+
+    line = out.stdout.strip().splitlines()
+    if not line:
+        return None, "active"
+    first = line[0].strip()
+    parts = [p.strip() for p in first.split(",")]
+    try:
+        temp = float(parts[0])
+    except (ValueError, IndexError):
+        return None, "active"
+    return round(temp, 1), "active"
+
+
+def read_fan(hp_hwmon):
+    """Return (rpm1, rpm2, pwm_enable_int)."""
+    rpm1 = read_int(os.path.join(hp_hwmon, "fan1_input")) if hp_hwmon else None
+    rpm2 = read_int(os.path.join(hp_hwmon, "fan2_input")) if hp_hwmon else None
+    pwm = read_int(os.path.join(hp_hwmon, "pwm1_enable")) if hp_hwmon else None
+    return rpm1, rpm2, pwm
+
+
+def read_battery_and_ac():
+    """
+    Scan /sys/class/power_supply/*.
+    Returns (on_battery: bool, battery_pct: int|None).
+
+    on_battery is True when a Mains supply reports online==0, OR (if no Mains
+    found) when a Battery reports status "Discharging".
+    """
+    battery_pct = None
+    battery_status = None
+    mains_online = None
+
+    for entry in sorted(listdir(POWER_SUPPLY_ROOT)):
+        d = os.path.join(POWER_SUPPLY_ROOT, entry)
+        ps_type = read_text(os.path.join(d, "type"))
+        if ps_type == "Battery":
+            cap = read_int(os.path.join(d, "capacity"))
+            if cap is not None and battery_pct is None:
+                battery_pct = cap
+            st = read_text(os.path.join(d, "status"))
+            if st is not None and battery_status is None:
+                battery_status = st
+        elif ps_type == "Mains":
+            online = read_int(os.path.join(d, "online"))
+            if online is not None:
+                # If any mains is online, we are not on battery.
+                if mains_online is None:
+                    mains_online = online
+                elif online == 1:
+                    mains_online = 1
+
+    if mains_online is not None:
+        on_battery = (mains_online == 0)
+    elif battery_status is not None:
+        on_battery = (battery_status.lower() == "discharging")
+    else:
+        on_battery = False
+
+    return on_battery, battery_pct
+
+
+def read_platform_profile():
+    """Return the current platform_profile (e.g. 'performance'), or None."""
+    return read_text(PLATFORM_PROFILE_PATH)
+
+
+# ----------------------------------------------------------------------------
+# Config handling
+# ----------------------------------------------------------------------------
+def deep_merge(base, override):
+    """Recursively merge `override` onto a copy of `base` and return it."""
+    out = dict(base)
+    for k, v in override.items():
+        if k in out and isinstance(out[k], dict) and isinstance(v, dict):
+            out[k] = deep_merge(out[k], v)
+        else:
+            out[k] = v
+    return out
+
+
+def load_config():
+    """
+    Load config.json merged over DEFAULT_CONFIG.
+    On any failure, return a copy of DEFAULT_CONFIG.
+ """ + try: + with open(CONFIG_PATH, "r") as f: + user_cfg = json.load(f) + if not isinstance(user_cfg, dict): + raise ValueError("config root is not an object") + return deep_merge(DEFAULT_CONFIG, user_cfg) + except FileNotFoundError: + return json.loads(json.dumps(DEFAULT_CONFIG)) + except Exception as exc: # noqa: BLE001 + log("config load failed (%s); using defaults" % exc) + return json.loads(json.dumps(DEFAULT_CONFIG)) + + +def reset_override_on_startup(): + """ + On startup, force config.override back to "policy" so a forced MAX/AUTO + state never persists across reboots. Only rewrites the file if it actually + needs changing, and preserves all other keys. Written atomically. + """ + try: + with open(CONFIG_PATH, "r") as f: + raw = json.load(f) + except FileNotFoundError: + # No config file yet — nothing to reset; daemon will use defaults. + return + except Exception as exc: # noqa: BLE001 + log("startup override reset: cannot read config (%s); skipping" % exc) + return + + if not isinstance(raw, dict): + return + + if raw.get("override", "policy") == "policy": + return # already clean + + raw["override"] = "policy" + try: + atomic_write(CONFIG_PATH, json.dumps(raw, indent=2) + "\n", mode=0o664) + log("startup: reset override -> policy") + except Exception as exc: # noqa: BLE001 + log("startup override reset failed: %s" % exc) + + +# ---------------------------------------------------------------------------- +# Decision logic +# ---------------------------------------------------------------------------- +def compute_hysteresis(profile_name, thresholds, temps, prev_state): + """ + Apply hysteresis over present sensors. + + Present sensors: cpu always (if not None); igpu if not None; dgpu only if + not None. For each present sensor we use _on / _off. + + Rules: + * If ANY present sensor >= its _on threshold -> MAX. + * elif prev_state == MAX and ALL present sensors <= their _off -> AUTO. + * else keep prev_state. + + Returns (desired_pwm, reason). 
+ """ + present = [] # list of (label, value, on_thr, off_thr) + for label in ("cpu", "igpu", "dgpu"): + val = temps.get(label) + if val is None: + continue + on_thr = thresholds.get(label + "_on") + off_thr = thresholds.get(label + "_off") + if on_thr is None or off_thr is None: + continue + present.append((label, val, on_thr, off_thr)) + + if not present: + # No usable sensors — be safe and keep previous state. + reason = "%s: no sensors -> keep %s" % ( + profile_name, + "MAX" if prev_state == PWM_MAX else "AUTO", + ) + return prev_state, reason + + # Any sensor at/over its ON threshold => MAX. + for label, val, on_thr, off_thr in present: + if val >= on_thr: + reason = "%s: %s %g>=%g -> MAX" % (profile_name, label, val, on_thr) + return PWM_MAX, reason + + # Currently MAX: only drop to AUTO when ALL present sensors are at/below OFF. + if prev_state == PWM_MAX: + all_cool = all(val <= off_thr for _, val, _, off_thr in present) + if all_cool: + reason = "%s: all cool -> AUTO" % profile_name + return PWM_AUTO, reason + # Still in the hysteresis band -> stay MAX. + hot = [ + "%s %g>%g" % (label, val, off_thr) + for label, val, _, off_thr in present + if val > off_thr + ] + reason = "%s: hysteresis hold MAX (%s)" % (profile_name, ", ".join(hot)) + return PWM_MAX, reason + + # Currently AUTO and nothing over ON -> stay AUTO. + reason = "%s: below thresholds -> AUTO" % profile_name + return PWM_AUTO, reason + + +def decide(config, temps, on_battery, battery_pct, platform_profile, prev_state): + """ + Run the full decision logic. + Returns (desired_pwm, reason, policy, effective_policy). 
+ """ + enabled = bool(config.get("enabled", True)) + if not enabled: + return PWM_AUTO, "disabled", "disabled", "disabled" + + override = config.get("override", "policy") + profile_map = config.get("profile_map", {}) + low_battery_pct = config.get("low_battery_pct", 20) + + # policy is derived from platform_profile even when overridden, so status + # can still report it; effective is what we'd act on under policy. + policy = profile_map.get(platform_profile, "balanced") + low_batt_active = ( + on_battery and battery_pct is not None and battery_pct <= low_battery_pct + ) + effective = "power-saver" if low_batt_active else policy + + if override == "max": + return PWM_MAX, "override: forced MAX", policy, effective + if override == "auto": + return PWM_AUTO, "override: forced AUTO", policy, effective + + # override == "policy" (or anything unrecognized) -> compute policy. + if effective == "performance": + return PWM_MAX, "performance: forced MAX", policy, effective + + profiles = config.get("profiles", {}) + thresholds = profiles.get(effective) + if not isinstance(thresholds, dict): + # Unknown profile or malformed -> fall back to balanced defaults. + thresholds = DEFAULT_CONFIG["profiles"]["balanced"] + + if thresholds.get("always_max"): + return PWM_MAX, "%s: always_max -> MAX" % effective, policy, effective + + desired, reason = compute_hysteresis(effective, thresholds, temps, prev_state) + return desired, reason, policy, effective + + +# ---------------------------------------------------------------------------- +# Daemon +# ---------------------------------------------------------------------------- +class Daemon: + def __init__(self): + self.running = True + # Cached discovered paths (re-discovered if they vanish). + self.hp_hwmon = None + self.cpu_hwmon = None + self.igpu_hwmon = None + self.nvidia_dev = None + # Persistent hysteresis state across loop iterations. Initial = AUTO. 
+ self.prev_state = PWM_AUTO + + # --- path (re)discovery ------------------------------------------------- + def ensure_paths(self): + """(Re)discover any cached path that is missing/stale.""" + if not self.hp_hwmon or not os.path.exists( + os.path.join(self.hp_hwmon, "pwm1_enable") + ): + self.hp_hwmon = find_hp_wmi_hwmon() + + if not self.cpu_hwmon or not os.path.exists( + os.path.join(self.cpu_hwmon, "temp1_input") + ): + self.cpu_hwmon = find_hwmon_by_name(["k10temp", "coretemp", "zenpower"]) + + if not self.igpu_hwmon or not os.path.exists( + os.path.join(self.igpu_hwmon, "temp1_input") + ): + self.igpu_hwmon = find_hwmon_by_name(["amdgpu"]) + + if not self.nvidia_dev or not os.path.exists(self.nvidia_dev): + self.nvidia_dev = find_nvidia_pci_device() + + # --- fan control -------------------------------------------------------- + def read_current_pwm(self): + if not self.hp_hwmon: + return None + val = read_text(os.path.join(self.hp_hwmon, "pwm1_enable")) + return val + + def write_pwm(self, value): + """ + Write `value` ("0" or "2") to pwm1_enable. Returns True on success. + Logs and returns False on error (never raises). + """ + if not self.hp_hwmon: + log("cannot write pwm: hp-wmi hwmon not found") + return False + path = os.path.join(self.hp_hwmon, "pwm1_enable") + try: + with open(path, "w") as f: + f.write(value) + return True + except OSError as exc: + log("pwm write failed (%s -> %s): %s" % (value, path, exc)) + return False + + def apply_desired(self, desired): + """Write desired pwm only if different from current. Returns applied state.""" + current = self.read_current_pwm() + if current == desired: + return desired + ok = self.write_pwm(desired) + if ok: + return desired + # Write failed: report whatever the hardware currently is. + return current if current is not None else desired + + # --- one loop iteration ------------------------------------------------- + def step(self): + self.ensure_paths() + config = load_config() + + # Gather sensors. 
+ cpu = read_cpu_temp(self.cpu_hwmon) + igpu = read_igpu_temp(self.igpu_hwmon) + dgpu, dgpu_state = read_dgpu(self.nvidia_dev) + temps = {"cpu": cpu, "igpu": igpu, "dgpu": dgpu} + + on_battery, battery_pct = read_battery_and_ac() + platform_profile = read_platform_profile() + + desired, reason, policy, effective = decide( + config, temps, on_battery, battery_pct, platform_profile, self.prev_state + ) + + applied = self.apply_desired(desired) + # Persist the logical state for hysteresis continuity. We track the + # *desired* state rather than `applied`, so hysteresis stays stable even if + # a single write transiently fails. + self.prev_state = desired + + rpm1, rpm2, pwm_enable = read_fan(self.hp_hwmon) + # Prefer the freshly-read hardware pwm for reporting; fall back to applied. + eff_pwm = pwm_enable + if eff_pwm is None: + try: + eff_pwm = int(applied) + except (ValueError, TypeError): + eff_pwm = None + fan_state = "MAX" if eff_pwm == 0 else ("AUTO" if eff_pwm == 2 else "?") + + low_batt_active = ( + on_battery + and battery_pct is not None + and battery_pct <= config.get("low_battery_pct", 20) + ) + + status = { + "ts": round(time.time(), 3), + "daemon_version": DAEMON_VERSION, + "enabled": bool(config.get("enabled", True)), + "power_profile_raw": platform_profile, + "policy": policy, + "effective_policy": effective, + "on_battery": bool(on_battery), + "battery_pct": battery_pct, + "low_battery_active": bool(low_batt_active), + "override": config.get("override", "policy"), + "temps": { + "cpu": cpu, + "igpu": igpu, + "dgpu": dgpu, + "dgpu_state": dgpu_state, + }, + "fan": { + "rpm1": rpm1, + "rpm2": rpm2, + "pwm_enable": eff_pwm, + "state": fan_state, + }, + "decision_reason": reason, + } + + try: + atomic_write(STATUS_PATH, json.dumps(status) + "\n", mode=0o644) + except Exception as exc: # noqa: BLE001 + log("status write failed: %s" % exc) + + # --- main loop ---------------------------------------------------------- + def run(self): + log("starting (version 
%s)" % DAEMON_VERSION) + reset_override_on_startup() + + # Discover up-front (loop will re-discover if anything moves). + self.ensure_paths() + if not self.hp_hwmon: + log("WARNING: hp-wmi hwmon not found at startup; will keep retrying") + + while self.running: + loop_start = time.time() + try: + self.step() + except Exception as exc: # noqa: BLE001 - daemon must never die here + log("loop error (continuing): %s" % exc) + + # Poll interval is read fresh each loop from config (clamped sane). + try: + interval = float(load_config().get("poll_interval_sec", 2)) + except Exception: # noqa: BLE001 + interval = 2.0 + if interval < 0.5: + interval = 0.5 + if interval > 60: + interval = 60.0 + + # Sleep the remainder of the interval, responsive to shutdown. + elapsed = time.time() - loop_start + remaining = interval - elapsed + while remaining > 0 and self.running: + time.sleep(min(remaining, 0.5)) + remaining = interval - (time.time() - loop_start) + + self.shutdown() + + def shutdown(self): + """Hand fans back to firmware (AUTO) and exit cleanly.""" + log("shutting down; restoring AUTO (pwm1_enable=2)") + # Make sure we have a path even if discovery never ran. + if not self.hp_hwmon: + self.hp_hwmon = find_hp_wmi_hwmon() + if self.hp_hwmon: + try: + with open(os.path.join(self.hp_hwmon, "pwm1_enable"), "w") as f: + f.write(PWM_AUTO) + except OSError as exc: + log("failed to restore AUTO on exit: %s" % exc) + else: + log("no hp-wmi hwmon found; cannot restore AUTO") + + def handle_signal(self, signum, _frame): + log("received signal %d" % signum) + self.running = False + + +def main(): + if os.geteuid() != 0: + log("must run as root (need to write sysfs pwm + /run)") + # Exit non-zero so systemd shows a clear failure. 
+ return 1 + + daemon = Daemon() + signal.signal(signal.SIGTERM, daemon.handle_signal) + signal.signal(signal.SIGINT, daemon.handle_signal) + try: + daemon.run() + except Exception as exc: # noqa: BLE001 + # Absolute last resort: still try to restore AUTO before dying. + log("fatal: %s" % exc) + try: + daemon.shutdown() + except Exception: # noqa: BLE001 + pass + return 1 + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/victus-fan/docs/TESTING.md b/victus-fan/docs/TESTING.md new file mode 100644 index 0000000..f3b92b3 --- /dev/null +++ b/victus-fan/docs/TESTING.md @@ -0,0 +1,116 @@ +# Testing & Verification + +`victus-fan` was built and verified **end-to-end on real hardware** before +release. This document records the exact test machine and every check that was +run, so you can reproduce them. + +## Test machine + +| Item | Value | +|--------------|--------------------------------------------------------------| +| Hostname | `jayesh-Victus` | +| Model | HP Victus 15-fb0xxx (*Victus by HP Gaming Laptop*) | +| CPU | AMD Ryzen 5 5600H — sensor `k10temp` (Tctl) | +| iGPU | AMD Radeon Vega (Cezanne) — sensor `amdgpu` (edge) | +| dGPU | NVIDIA GeForce RTX 3050 Mobile — read via `nvidia-smi` | +| OS | **Ubuntu 26.04 LTS** (Resolute Raccoon) | +| Kernel | `7.0.0-22-generic` | +| Secure Boot | **Enabled** | +| Fan driver | stock in-tree `hp-wmi` (`pwm1_enable`, two-state) | +| GNOME Shell | 50 | + +> Because Secure Boot is on and Ubuntu ships no patched `hp-wmi` module, this +> project deliberately uses **only the stock driver** — nothing is compiled, +> signed, or DKMS-built. 
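To reproduce the `pwm1_enable` readings recorded below without the daemon or CLI, the value can be read straight from sysfs. The following is a minimal sketch, not part of the project: the helper names `fan_state_from_pwm` and `read_fan_state` are hypothetical, and the glob pattern assumes the hp-wmi hwmon directory sits where this project's uninstall script looks for it.

```python
import glob

# Assumed location of the stock hp-wmi fan knob; the hwmon index can change
# between boots, hence the glob.
PWM_GLOB = "/sys/devices/platform/hp-wmi/hwmon/hwmon*/pwm1_enable"


def fan_state_from_pwm(raw):
    """Map a raw pwm1_enable reading to the state names used in this report."""
    try:
        value = int(str(raw).strip())
    except ValueError:
        return "?"
    # 0 = MAX (full speed), 2 = AUTO (firmware curve); anything else is unknown.
    return {0: "MAX", 2: "AUTO"}.get(value, "?")


def read_fan_state(pattern=PWM_GLOB):
    """Return (path, state) for the first readable pwm1_enable, else (None, "?")."""
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, "r") as f:
                return path, fan_state_from_pwm(f.read())
        except OSError:
            continue
    return None, "?"
```

Calling `read_fan_state()` after each power-profile change should give the same MAX / AUTO labels as the tables in this document, provided the sysfs path assumption holds on your machine. Reading the file needs no root; only writing it does.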
+ +## Results summary + +| # | Check | Result | +|---|-----------------------------------------|--------| +| 1 | Performance mode → fans MAX | ✅ pass | +| 2 | Balanced mode → firmware AUTO when cool | ✅ pass | +| 3 | Power-saver mode → firmware AUTO | ✅ pass | +| 4 | Manual overrides beat policy | ✅ pass | +| 5 | Stop service → fans restored to AUTO | ✅ pass | +| 6 | NVIDIA dGPU never woken to read temp | ✅ pass | +| 7 | Non-root config edits (group model) | ✅ pass | + +## Details + +### 1–3. Power-mode tracking + +The active power profile was cycled with `powerprofilesctl` and the daemon's +effect on `pwm1_enable` was read back from sysfs each time: + +| Power mode (GNOME) | `platform_profile` | Fan state | `pwm1_enable` | Measured fan RPM | +|--------------------|--------------------|-----------|---------------|------------------| +| Performance | `performance` | **MAX** | `0` | ~5400 / ~5200 | +| Balanced | `balanced` | **AUTO** | `2` | follows firmware | +| Power-saver | `quiet` | **AUTO** | `2` | follows firmware | + +(GNOME's *power-saver* maps `platform_profile` to `quiet`, which the daemon's +`profile_map` resolves to the `power-saver` policy.) + +### 4. Manual override (CLI) + +Starting in `performance` (normally forced MAX), each override was applied and +the result read back: + +| Command | `pwm1_enable` | Notes | +|------------------------------|---------------|--------------------------------| +| `victus-fanctl override auto` | `2` | AUTO wins over performance | +| `victus-fanctl override max` | `0` | MAX | +| `victus-fanctl override policy` | `0` | back to performance → MAX | + +### 5. Safety — restore on stop + +`sudo systemctl stop victus-fan` was observed to flip `pwm1_enable` from `0` +back to `2`, with the journal line: + +``` +victus-fand: shutting down; restoring AUTO (pwm1_enable=2) +``` + +So whenever the controller stops on SIGTERM/SIGINT or an unhandled exception, the +fans return to the HP firmware curve; a hard kill (SIGKILL) skips this path. + +### 6. 
NVIDIA no-wake + +`nvidia-smi` is only invoked when the NVIDIA PCI device reports +`power/runtime_status == active`. When the dGPU is runtime-suspended the status +reports `dgpu: null` / `dgpu_state: "suspended"` and the controller never wakes +it — confirmed by instrumenting the call path. + +### 7. Permission model + +A non-root member of the `victusfan` group was able to change settings: + +```bash +sudo -u jayesh -g victusfan victus-fanctl override auto # succeeded, no root +``` + +and `/etc/victus-fan/config.json` remained group-writable (`-rw-rw-r-- … victusfan`) +afterwards, so the daemon (root) and the user can both manage it. + +## Live example + +A representative `victus-fanctl status` snapshot from the test machine, under +load in performance mode: + +``` +victus-fan status + enabled : True + override : policy + power profile : performance + policy : performance + effective : performance + power source : AC battery 100% + temps : cpu 95.4 C | igpu 54.0 C | dgpu 47.0 C [active] + fan : MAX (fan1 5363 rpm, fan2 5220 rpm, pwm_enable=0) + reason : performance: forced MAX +``` + +## Status + +Installed and **running as a systemd service (`victus-fan.service`) in daily +use** on the machine above. diff --git a/victus-fan/install.sh b/victus-fan/install.sh new file mode 100755 index 0000000..4cd8733 --- /dev/null +++ b/victus-fan/install.sh @@ -0,0 +1,154 @@ +#!/usr/bin/env bash +# +# victus-fan installer +# -------------------- +# Userspace fan controller for an HP Victus 15-fb0xxx on Ubuntu 26.04 +# (kernel 7.0, Secure Boot ON). NO kernel module — stock hp-wmi only. 
+# +# This script installs: +# daemon -> /usr/lib/victus-fan/victus-fand (root, 0755) +# cli -> /usr/bin/victus-fanctl (0755) +# service -> /etc/systemd/system/victus-fan.service (runs as root) +# tmpfiles-> /etc/tmpfiles.d/victus-fan.conf +# config -> /etc/victus-fan/config.json root:victusfan 0664 (in dir 2775) +# +# It also creates the system group "victusfan" and adds the installing desktop +# user to it so the CLI can atomically replace /etc/victus-fan/config.json. + +set -euo pipefail + +# --------------------------------------------------------------------------- +# Resolve our own location FIRST, before any privilege juggling, so the source +# tree is found no matter where the script is invoked from. +# --------------------------------------------------------------------------- +SRC_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +# --------------------------------------------------------------------------- +# Re-exec with sudo if we are not root. We preserve $0 and arguments so the +# re-exec'd copy points back at the same script in the same source tree. +# sudo keeps SUDO_USER set to the invoking user, which is exactly what we use +# below to find the desktop user. +# --------------------------------------------------------------------------- +if [[ "${EUID:-$(id -u)}" -ne 0 ]]; then + echo ">> Re-executing with sudo for root privileges..." + exec sudo "$0" "$@" +fi + +# --------------------------------------------------------------------------- +# Detect the desktop user via SUDO_USER. When the script is run directly as +# root (e.g. in a root shell) SUDO_USER is empty; warn but continue with the +# system-wide parts. +# --------------------------------------------------------------------------- +DESKTOP_USER="${SUDO_USER:-}" +if [[ -z "$DESKTOP_USER" || "$DESKTOP_USER" == "root" ]]; then + echo "!! WARNING: could not resolve a non-root desktop user via SUDO_USER." + echo "!! Group membership and the GNOME extension will be skipped." + echo "!! 
Re-run with: sudo ./install.sh (from your user session)." + DESKTOP_USER="" +fi + +# Resolve the desktop user's HOME from the password database (do not trust the +# inherited $HOME, which is root's under sudo). +DESKTOP_HOME="" +if [[ -n "$DESKTOP_USER" ]]; then + DESKTOP_HOME="$(getent passwd "$DESKTOP_USER" | cut -d: -f6)" + if [[ -z "$DESKTOP_HOME" ]]; then + echo "!! WARNING: could not resolve home directory for '$DESKTOP_USER'." + fi +fi + +echo ">> Source tree: $SRC_DIR" +echo ">> Desktop user: ${DESKTOP_USER:-}" + +# --------------------------------------------------------------------------- +# 1. Dependencies. The daemon and CLI are pure stdlib python3, which is part of +# a base Ubuntu install; we make sure it is present but never abort if apt is +# momentarily busy — there is nothing else to install. +# --------------------------------------------------------------------------- +echo ">> Ensuring python3 is present ..." +if command -v apt-get >/dev/null 2>&1; then + apt-get install -y python3 || \ + echo "!! WARNING: apt-get failed; python3 is normally already present." +else + echo "!! WARNING: apt-get not found; skipping dependency check." +fi + +# --------------------------------------------------------------------------- +# 2. System group + membership. +# --------------------------------------------------------------------------- +echo ">> Ensuring system group 'victusfan' exists..." +if ! getent group victusfan >/dev/null 2>&1; then + groupadd --system victusfan + echo " created group 'victusfan'." +else + echo " group 'victusfan' already exists." +fi + +if [[ -n "$DESKTOP_USER" ]]; then + echo ">> Adding '$DESKTOP_USER' to group 'victusfan'..." + usermod -aG victusfan "$DESKTOP_USER" +fi + +# --------------------------------------------------------------------------- +# 3. Install program files. +# --------------------------------------------------------------------------- +echo ">> Installing daemon and CLI..." 
+install -D -m 0755 "$SRC_DIR/daemon/victus-fand" /usr/lib/victus-fan/victus-fand +install -D -m 0755 "$SRC_DIR/daemon/victus-fanctl" /usr/bin/victus-fanctl +install -D -m 0644 "$SRC_DIR/README.md" /usr/share/doc/victus-fan/README.md + +# --------------------------------------------------------------------------- +# 4. Config directory + default config. +# Dir: root:victusfan 2775 (setgid + group-writable) so UIs can atomically +# replace files via rename. Config: root:victusfan 0664, installed ONLY if +# it does not already exist (never clobber a user's tuning). +# --------------------------------------------------------------------------- +echo ">> Setting up /etc/victus-fan ..." +mkdir -p /etc/victus-fan +chown root:victusfan /etc/victus-fan +chmod 2775 /etc/victus-fan + +if [[ ! -e /etc/victus-fan/config.json ]]; then + install -m 0664 "$SRC_DIR/packaging/config.default.json" \ + /etc/victus-fan/config.json + chown root:victusfan /etc/victus-fan/config.json + echo " installed default config.json." +else + echo " keeping existing /etc/victus-fan/config.json (not overwritten)." +fi + +# --------------------------------------------------------------------------- +# 5. systemd unit + tmpfiles runtime dir. +# --------------------------------------------------------------------------- +echo ">> Installing systemd service and tmpfiles config..." +install -D -m 0644 "$SRC_DIR/packaging/victus-fan.service" \ + /etc/systemd/system/victus-fan.service +install -D -m 0644 "$SRC_DIR/packaging/victus-fan.tmpfiles" \ + /etc/tmpfiles.d/victus-fan.conf + +echo ">> Creating runtime directory and (re)loading systemd..." +systemd-tmpfiles --create /etc/tmpfiles.d/victus-fan.conf +systemctl daemon-reload +systemctl enable --now victus-fan.service +echo " victus-fan.service enabled and started." + +# --------------------------------------------------------------------------- +# Done — final guidance. 
+# --------------------------------------------------------------------------- +cat <> Re-executing with sudo for root privileges..." + exec sudo "$0" "$@" +fi + +# --------------------------------------------------------------------------- +# 1. Stop and disable the service (best effort). +# --------------------------------------------------------------------------- +echo ">> Stopping and disabling victus-fan.service ..." +systemctl disable --now victus-fan.service >/dev/null 2>&1 || true + +# --------------------------------------------------------------------------- +# 2. Hand fans back to firmware: write "2" (AUTO) to the hp-wmi pwm1_enable. +# The daemon normally does this on SIGTERM, but do it here too in case the +# daemon had already died. Discover the hwmon dir generically. +# --------------------------------------------------------------------------- +echo ">> Restoring firmware fan control (pwm1_enable = 2 / AUTO) ..." +restored=0 +for d in /sys/devices/platform/hp-wmi/hwmon/hwmon*/; do + if [[ -w "${d}pwm1_enable" ]]; then + if echo 2 > "${d}pwm1_enable" 2>/dev/null; then + echo " wrote AUTO to ${d}pwm1_enable" + restored=1 + fi + fi +done +[[ "$restored" -eq 0 ]] && echo " (no writable hp-wmi pwm1_enable found; skipping)" + +# --------------------------------------------------------------------------- +# 3. Remove installed program files (idempotent). +# --------------------------------------------------------------------------- +echo ">> Removing installed files ..." +rm -f /usr/lib/victus-fan/victus-fand || true +rmdir /usr/lib/victus-fan 2>/dev/null || true +rm -f /usr/bin/victus-fanctl || true +rm -f /etc/systemd/system/victus-fan.service || true +rm -f /etc/tmpfiles.d/victus-fan.conf || true + +# Reload systemd so the removed unit disappears, and clean the runtime dir. +systemctl daemon-reload 2>/dev/null || true +rm -rf /run/victus-fan 2>/dev/null || true + +# --------------------------------------------------------------------------- +# 4. 
Leave config + group, with guidance. +# --------------------------------------------------------------------------- +cat <