Skip to content

Commit b7f3733

Browse files
Darth-Hidiousclaude
andcommitted
docs: update WORKFLOW_GUIDE with hooks, conditionals, parallel, nesting, retries, obligations
Complete reference for all 8 step types, 3 hooks, OPA obligations, retry behavior, context chaining for parallel/nested workflows, and a full pipeline example using every feature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 4f9eb10 commit b7f3733

1 file changed

Lines changed: 306 additions & 9 deletions

File tree

docs/WORKFLOW_GUIDE.md

Lines changed: 306 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -194,6 +194,10 @@ description: What it does. # Shown in prism workflow list
194194
default_mode: dry_run # "dry_run" or "execute"
195195
arguments: [...] # CLI arguments
196196
steps: [...] # DAG of execution steps
197+
hooks: # Optional lifecycle hooks
198+
on_start: [...] # Steps to run before the workflow
199+
on_complete: [...] # Steps to run after success
200+
on_error: [...] # Steps to run on failure
197201
```
198202
199203
### Arguments
@@ -271,6 +275,177 @@ Calls `POST http://127.0.0.1:7327/api/tools/{name}/run` on the local node. Requi
271275

272276
**OPA policy is checked per tool step** — see Security section below.
273277

278+
#### `if` — Conditional branching
279+
280+
```yaml
281+
- id: check_results
282+
action: if
283+
condition: "{{ search.body.count }}" # Truthy check
284+
then:
285+
- id: found
286+
action: message
287+
text: "Found {{ search.body.count }} materials"
288+
- id: process
289+
action: tool
290+
name: predict_properties
291+
inputs:
292+
dataset: "{{ search.body }}"
293+
else:
294+
- id: not_found
295+
action: message
296+
text: "No materials found — trying broader search"
297+
```
298+
299+
**Truthy values:** non-empty string (except `"false"`, `"0"`, `"null"`), non-zero number, non-empty array/object, `true`.
300+
301+
**Falsy values:** empty string, `"false"`, `"0"`, `"null"`, `0`, `null`, empty array/object.
302+
303+
Sub-steps in `then`/`else` can be any step type (`set`, `message`, `http`, `tool`). Context set by sub-steps is available to subsequent workflow steps.
304+
305+
Stored in context as `{{ step_id.branch }}` ("then" or "else") and `{{ step_id.condition }}`.
306+
307+
#### `parallel` — Concurrent execution
308+
309+
```yaml
310+
- id: multi_search
311+
action: parallel
312+
steps:
313+
- id: search_mp
314+
action: http
315+
method: GET
316+
url: "https://api.materialsproject.org/materials?formula={{ formula }}"
317+
- id: search_nomad
318+
action: http
319+
method: GET
320+
url: "https://nomad-lab.eu/api/v1/entries?formula={{ formula }}"
321+
- id: search_graph
322+
action: http
323+
method: GET
324+
url: "{{ platform_api_base }}/knowledge/graph/search?q={{ formula }}"
325+
```
326+
327+
All sub-steps execute concurrently via `tokio::spawn`. Each sub-step's context is merged back into the parent. In dry run, shows the plan without executing.
328+
329+
Stored in context as `{{ step_id.completed }}` (count) and `{{ step_id.steps }}` (list of sub-step IDs).
330+
331+
**Note:** Sub-steps run independently — they cannot reference each other's output. Use `parallel` for fan-out queries, not for dependent chains.
332+
333+
#### `workflow` — Call a sub-workflow
334+
335+
```yaml
336+
- id: train_model
337+
action: workflow
338+
name: forge # Must exist in discovery paths
339+
inputs:
340+
paper: "{{ paper }}"
341+
dataset: "{{ candidates }}"
342+
target: "local"
343+
```
344+
345+
Recursively executes another workflow with its own arguments, steps, and policy checks. The child workflow's full result (context + steps) is stored under the step ID.
346+
347+
Access child results: `{{ train_model.workflow }}`, `{{ train_model.steps }}`, `{{ train_model.context.variable }}`.
348+
349+
**OPA policy:** The child workflow gets its own `workflow.execute` policy check. If the child is denied, the parent aborts.
350+
351+
#### Retries — on any step
352+
353+
Any step can have retry configuration:
354+
355+
```yaml
356+
- id: flaky_api
357+
action: http
358+
method: GET
359+
url: "https://unreliable-api.example.com/data"
360+
retries: 3 # Retry up to 3 times on failure
361+
retry_delay_secs: 2 # Base delay (multiplied by attempt number)
362+
expect_status: [200]
363+
```
364+
365+
Retry behavior:
366+
- Attempt 0: immediate
367+
- Attempt 1: wait `retry_delay_secs * 1` seconds
368+
- Attempt 2: wait `retry_delay_secs * 2` seconds
369+
- If all attempts fail, the workflow aborts with the last error
370+
- Dry run mode never retries
371+
372+
Works on all step types: `set`, `message`, `http`, `tool`, `if`, `parallel`, `workflow`.
373+
374+
---
375+
376+
## Hooks
377+
378+
Hooks run before and after the main workflow steps. They're defined at the top level of the YAML:
379+
380+
```yaml
381+
hooks:
382+
on_start:
383+
- id: notify_start
384+
action: http
385+
method: POST
386+
url: "https://slack.example.com/webhook"
387+
body:
388+
text: "Workflow {{ workflow_name }} starting"
389+
- id: log_start
390+
action: message
391+
text: "Starting {{ workflow_name }} at {{ now_iso }}"
392+
393+
on_complete:
394+
- id: notify_done
395+
action: http
396+
method: POST
397+
url: "https://slack.example.com/webhook"
398+
body:
399+
text: "Workflow {{ workflow_name }} completed"
400+
401+
on_error:
402+
- id: notify_fail
403+
action: message
404+
text: "Workflow {{ workflow_name }} failed"
405+
```
406+
407+
| Hook | When it runs | Context available |
408+
|------|-------------|-------------------|
409+
| `on_start` | Before any step, after argument resolution | Arguments + builtins only |
410+
| `on_complete` | After all steps succeed | Full context including all step outputs |
411+
| `on_error` | When any step fails (planned, not yet wired) | Context up to the failed step |
412+
413+
Hook steps can be `set`, `message`, or `http`. Hook failures are logged but don't abort the workflow.
414+
415+
---
416+
417+
## OPA Obligations
418+
419+
When OPA policy allows a workflow, it may also return **obligations** — things the system must do as a side effect.
420+
421+
### Built-in obligations
422+
423+
| Obligation | Trigger | Effect |
424+
|------------|---------|--------|
425+
| `audit_log` | Any `workflow.execute` action | Logs workflow start/complete via `tracing::info` with workflow name, principal, and timestamps |
426+
| `notify_admin` | Agent role executing a workflow | Emits `tracing::warn` so admin dashboards/alerting can pick it up |
427+
428+
### Custom obligations
429+
430+
Add custom obligations in your `.rego` policy:
431+
432+
```rego
433+
package prism.policy
434+
435+
# Require cost approval for expensive workflows
436+
obligations contains "cost_approval" if {
437+
input.action == "workflow.execute"
438+
input.context.step_count > 10
439+
}
440+
441+
# Require audit for any agent action
442+
obligations contains "audit_log" if {
443+
input.principal == "agent"
444+
}
445+
```
446+
447+
Obligations are returned in the `PolicyDecision.obligations` field and logged/acted on by the workflow engine. Custom obligation handlers can be added in the Rust workflow engine as needed.
448+
274449
---
275450

276451
## Template Engine
@@ -447,30 +622,39 @@ prism explore --space "Ni-Cr-Co" --target "hardness > 400" --execute
447622

448623
### Dry run vs Execute
449624

450-
| Mode | `set` steps | `message` steps | `http` steps | `tool` steps |
451-
|------|-------------|-----------------|--------------|--------------|
452-
| `dry_run` | Context updated, status=`planned` | Text rendered, status=`planned` | URL shown but **not called**, status=`planned` | Tool shown but **not called**, status=`planned` |
453-
| `execute` | Context updated, status=`completed` | Text rendered, status=`completed` | HTTP call made, response stored, status=`completed` | Tool called via node API, response stored, status=`completed` |
625+
| Mode | `set` | `message` | `http` | `tool` | `if` | `parallel` | `workflow` |
626+
|------|-------|-----------|--------|--------|------|------------|------------|
627+
| `dry_run` | Context updated, `planned` | Text rendered, `planned` | URL shown, **not called** | Tool shown, **not called** | Condition evaluated, branch shown | Steps listed, **not called** | Child shown, **not called** |
628+
| `execute` | Context updated, `completed` | Text rendered, `completed` | HTTP called, response stored | Tool called via node API | Branch executed, sub-steps run | All sub-steps run concurrently | Child workflow fully executed |
454629

455630
### Error handling
456631

457-
- If an `http` step returns a status not in `expect_status`, the workflow **aborts**
458-
- If a `tool` step returns HTTP 4xx/5xx, the workflow **aborts**
632+
- If an `http` step returns a status not in `expect_status`, the workflow **aborts** (unless `retries` is set)
633+
- If a `tool` step returns HTTP 4xx/5xx, the workflow **aborts** (unless `retries` is set)
459634
- If a template variable doesn't exist in context, the workflow **aborts** with `unknown workflow context path`
460635
- If a required argument is missing, the workflow **aborts** before any step runs
461636
- OPA deny → workflow **aborts** with the deny message
637+
- `retries` → retries with exponential backoff before aborting
638+
- `parallel` → if any sub-step fails, the entire parallel step fails
639+
- `workflow` → if the child workflow fails, the parent aborts
640+
- Hook failures are **logged but do not abort** the workflow
462641

463642
### Context lifetime
464643

465644
Context lives for the duration of the workflow run. Each step adds to it:
466645

467646
```
468-
Initial context (args + env + defaults + builtins)
647+
on_start hooks (can set context)
469648
└─ step 1 output added
470-
└─ step 2 output added
471-
└─ step 3 can read from step 1, step 2, and all args
649+
└─ step 2 output added (if/parallel/workflow sub-steps also add)
650+
└─ step 3 can read from step 1, step 2, all args, and hook context
651+
on_complete hooks (can read full context)
472652
```
473653

654+
For `parallel` steps, each sub-step runs with a snapshot of the current context. Sub-step outputs are merged back — if two sub-steps set the same key, last-to-finish wins.
655+
656+
For `workflow` (nesting), the child gets its own context built from `inputs`. The child's full result is stored under the parent step ID — access with `{{ step_id.context.variable }}`.
657+
474658
---
475659

476660
## Putting It Together: GFlowNet Exploration
@@ -498,3 +682,116 @@ The complete flow for adding a new GFlowNet exploration capability:
498682
```
499683

500684
No Rust code changes. No CLI modifications. No compilation. The workflow engine handles discovery, argument parsing, template rendering, OPA policy, HTTP calls, tool dispatch, and result reporting.
685+
686+
---
687+
688+
## Complete Example: All Features
689+
690+
A workflow that uses every engine feature:
691+
692+
```yaml
693+
api_version: prism/v1
694+
kind: workflow
695+
name: full-pipeline
696+
command_name: pipeline
697+
description: End-to-end materials discovery with all engine features.
698+
699+
arguments:
700+
- name: formula
701+
type: string
702+
required: true
703+
help: Chemical formula to investigate, e.g. NiCrCoAlTi
704+
- name: target
705+
type: string
706+
default: yield_strength
707+
help: Property to optimize
708+
709+
hooks:
710+
on_start:
711+
- id: h_start
712+
action: message
713+
text: "Pipeline starting for {{ formula }} targeting {{ target }}"
714+
on_complete:
715+
- id: h_done
716+
action: http
717+
method: POST
718+
url: "https://hooks.example.com/notify"
719+
body:
720+
text: "Pipeline for {{ formula }} completed"
721+
722+
steps:
723+
# 1. Search multiple databases in parallel
724+
- id: search
725+
action: parallel
726+
steps:
727+
- id: graph
728+
action: http
729+
method: GET
730+
url: "https://api.marc27.com/api/v1/knowledge/graph/search?q={{ formula }}"
731+
- id: semantic
732+
action: http
733+
method: POST
734+
url: "https://api.marc27.com/api/v1/knowledge/search"
735+
body:
736+
query: "{{ formula }} {{ target }}"
737+
738+
# 2. Check if we found anything
739+
- id: check
740+
action: if
741+
condition: "{{ graph.body }}"
742+
then:
743+
- id: found
744+
action: message
745+
text: "Found existing data — enriching with predictions"
746+
else:
747+
- id: not_found
748+
action: message
749+
text: "No existing data — starting from scratch"
750+
751+
# 3. Run GFlowNet exploration (retries for flaky GPU)
752+
- id: explore
753+
action: http
754+
method: POST
755+
url: "https://api.marc27.com/api/v1/compute/submit"
756+
retries: 2
757+
retry_delay_secs: 5
758+
body:
759+
image: "marc27/gflownet:latest"
760+
name: "explore-{{ formula }}"
761+
inputs:
762+
formula: "{{ formula }}"
763+
target: "{{ target }}"
764+
765+
# 4. Run the forge sub-workflow to train a model
766+
- id: train
767+
action: workflow
768+
name: forge
769+
inputs:
770+
paper: "gflownet-output"
771+
dataset: "{{ formula }}"
772+
target: "local"
773+
774+
# 5. Report
775+
- id: report
776+
action: message
777+
text: "Pipeline complete: explored {{ formula }}, job={{ explore.body.job_id }}, model trained"
778+
```
779+
780+
```bash
781+
prism pipeline --formula NiCrCoAlTi --target yield_strength --execute
782+
```
783+
784+
---
785+
786+
## Step Type Summary
787+
788+
| Step | Purpose | OPA check | Context output |
789+
|------|---------|-----------|---------------|
790+
| `set` | Set variables | No | `{{ step_id.key }}` |
791+
| `message` | Display text | No | `{{ step_id.message }}` |
792+
| `http` | Call any API | No | `{{ step_id.body }}`, `{{ step_id.status_code }}` |
793+
| `tool` | Call PRISM tool | **Yes** (per-tool) | `{{ step_id.output }}` |
794+
| `if` | Branch on condition | No | `{{ step_id.branch }}`, `{{ step_id.condition }}` |
795+
| `parallel` | Fan-out concurrent | No | `{{ step_id.completed }}`, `{{ step_id.steps }}` |
796+
| `workflow` | Call sub-workflow | **Yes** (child gets own check) | `{{ step_id.context.* }}`, `{{ step_id.steps }}` |
797+
| `retries` | Retry any step | Inherited | Same as wrapped step |

0 commit comments

Comments
 (0)