Feature: Web UI #556
Only the outer worker loop blocks on pause, the inner archive does not.
Frontier errata
I thought that just returning the StateTable via HTTP would lead to performance issues, especially when you want high-fidelity data.
Caveats
The streamed items can be inconsistent depending on when you opened the WebSocket channel (they may reference parent IDs that were fanned out before you started your WS). For now this is a necessary evil; if anyone has a nice idea on how we could solve this, I would be really glad :)
The delta is calculated at item level only, normally triggering a retransmit whenever the status of the Item changes. This could be made finer-grained to reduce the bandwidth of each WS.
We should also use something more efficient than JSON over the wire; CBOR seems like a good choice.
Demo
frontierdemo-720.mp4
(Note: the entire Web UI is vibe-coded for now to validate the API. I did not include it in this branch.)

#74 proposed adding a UI to manage Zeno. I came up with the following:
General architecture
My proposal is to heavily extend the already existing API (currently only used by Prometheus), breaking out as many functions as possible.
(Example: a seed was unsuccessful and we did not find any other pages. A script automatically performs a Google dork search, site:example.com, and adds the newly found URLs back to the reactor. Or you brute-force known paths, or use robots.txt plus brute force.)
The API can now be configured via api-static-dir to serve a local directory: you can point it at the dist from Vite, or write bare-bones .html yourself.
All API endpoints (and the served files) are unauthenticated; you can install an authenticating proxy in front of them (I used Cloudflare Tunnels: the daemon can be installed on Linux, and you can configure access via SSO, password, or whitelisted e-mails. You can also use AWS Cognito with CloudFront, or whatever floats your boat).
I propose avoiding RBAC, SSO, et cetera to keep this thing as simple as possible. With an API plus a directory file server, everybody can get this up and running fast.
API feature list (to be extended)
Pause/Unpause
Implemented and working. I had to fix a bug in a55719 (pausing did not stop crawling that was already in progress; it only took effect once a new seed began).
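One way to make a pause take effect in the middle of a seed is to check a shared gate before every fetch, not only between seeds. The sketch below shows that idea under stated assumptions: pauseGate and crawlSeed are hypothetical names, and this is not necessarily how a55719 fixes the bug.

```go
package main

import "sync"

// pauseGate is a hypothetical helper, not Zeno's actual type.
type pauseGate struct {
	mu     sync.Mutex
	cond   *sync.Cond
	paused bool
}

func newPauseGate() *pauseGate {
	g := &pauseGate{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// Pause makes subsequent Wait calls block.
func (g *pauseGate) Pause() {
	g.mu.Lock()
	g.paused = true
	g.mu.Unlock()
}

// Resume releases every goroutine blocked in Wait.
func (g *pauseGate) Resume() {
	g.mu.Lock()
	g.paused = false
	g.mu.Unlock()
	g.cond.Broadcast()
}

// Wait blocks for as long as the gate is paused.
func (g *pauseGate) Wait() {
	g.mu.Lock()
	for g.paused {
		g.cond.Wait()
	}
	g.mu.Unlock()
}

// crawlSeed fetches every URL of one seed and checks the gate before
// each fetch, so a pause stops work mid-seed instead of at the next seed.
func crawlSeed(g *pauseGate, urls []string, fetch func(string)) {
	for _, u := range urls {
		g.Wait()
		fetch(u)
	}
}
```

A fetch that is already in flight still runs to completion; the gate only prevents the next one from starting.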
Live-tailed Frontier
You can stream the Reactor StateTable via WebSocket now 🎉
There is a poller which polls the Reactor, computes the delta, and fans out the StateTable in real time. More details below.
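The item-level delta between two polls can be sketched as follows. Item, Delta, and computeDelta are hypothetical names with a simplified schema, assumed for illustration; the real StateTable rows likely carry more fields.

```go
package main

// Item is a hypothetical, simplified view of one StateTable row.
type Item struct {
	ID       string
	ParentID string
	Status   string
}

// Delta lists what changed between two consecutive polls.
type Delta struct {
	Upserted []Item   // new items, or items whose Status changed
	Removed  []string // IDs that are no longer in the table
}

// computeDelta compares two snapshots keyed by item ID. An item is
// retransmitted in full whenever its Status differs (item-level delta).
func computeDelta(prev, curr map[string]Item) Delta {
	var d Delta
	for id, it := range curr {
		if old, ok := prev[id]; !ok || old.Status != it.Status {
			d.Upserted = append(d.Upserted, it)
		}
	}
	for id := range prev {
		if _, ok := curr[id]; !ok {
			d.Removed = append(d.Removed, id)
		}
	}
	return d
}
```

A client that connects late only receives deltas from that point on, which is why an upserted item may carry a ParentID the client has never seen.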
Add seeds to Reactor
Live-tail logs
Similar to the Frontier (WebSocket), preserving the full schema of the logs (level and fields) so we can filter through them.
WebHook
If you could register a WebHook on e.g. ERROR level logs, you could hook that up to Slack and get notifications whenever there is something wrong with your Zeno instance.
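A rough sketch of such a hook, assuming log entries keep their level and fields: forward only ERROR entries as JSON to a configured URL. LogEntry and notifyIfError are hypothetical names, and the payload shape is an assumption; a Slack incoming webhook, for instance, expects the message in a "text" field, so a small adapter would be needed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// LogEntry is a hypothetical structured log record (level plus fields).
type LogEntry struct {
	Level   string         `json:"level"`
	Message string         `json:"message"`
	Fields  map[string]any `json:"fields,omitempty"`
}

// notifyIfError POSTs the entry as JSON to url when its level is ERROR.
// It reports whether a request was sent and accepted.
func notifyIfError(client *http.Client, url string, e LogEntry) (bool, error) {
	if e.Level != "ERROR" {
		return false, nil // below the hook's level: nothing to send
	}
	body, err := json.Marshal(e)
	if err != nil {
		return false, err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return false, fmt.Errorf("webhook returned status %d", resp.StatusCode)
	}
	return true, nil
}
```

In practice the call would run off the logging hot path (for example via a buffered channel) so that a slow webhook endpoint cannot stall the crawler.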