Commit c4dad2b

Author: Chibi Vikram
fix: ensure LangGraph nodes return dicts for proper checkpoint serialization
When using a Pydantic BaseModel for the State schema, nodes must return plain dicts rather than BaseModel instances. LangGraph cannot properly serialize BaseModel objects to checkpoints, resulting in empty state and validation errors during resume.

Changes:
- graph_simple.py: Return dict instead of State object from suspend_node
- runtime.py: Add Command(resume=...) wrapper for resume mode
- Add test_resume_direct.py: Test script for direct resume without API
- Add test_full_cycle.py: Interactive test for full suspend/resume cycle
- Add SUSPEND_RESUME_FIX.md: Detailed documentation of the fix
- Add MANUAL_TEST_GUIDE.md: Step-by-step manual testing guide
- Update README.md: Document new test files and evaluation setup
- Update uipath.json: Add agent-simple entrypoint for eval

This enables proper suspend/resume functionality for evaluations with RPA invocations, allowing agents to persist state across process restarts.
1 parent d0efd99 commit c4dad2b

9 files changed

Lines changed: 548 additions & 22 deletions

Lines changed: 214 additions & 0 deletions
@@ -0,0 +1,214 @@
# Manual Testing Guide for Suspend/Resume

This guide shows how to manually test the suspend/resume functionality using CLI commands.

## Step 1: Initial Execution (Suspend Phase)

Run the agent - it will suspend at the `interrupt()` call:

```bash
uv run uipath run agent-simple --input '{"query": "test manual suspend"}'
```

Expected output:
```
Status: SUSPENDED
Output: {
  'abc123...': {
    'message': 'Waiting for external completion',
    'query': 'test manual suspend'
  }
}
```

The key here is the **interrupt_id** (the long hash like `abc123...`). This is needed for resume.

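As a rough illustration of how the interrupt_id relates to the suspend output, the sketch below pulls it out of a dict shaped like the expected output in Step 1. The helper name `first_interrupt_id` and the sample hash are hypothetical; the only thing taken from the guide is that the output maps each interrupt_id to its payload.

```python
from typing import Any


def first_interrupt_id(output: dict[str, Any]) -> str:
    """Return the first key of a suspend output dict (hypothetical helper)."""
    if not output:
        raise ValueError("suspend output is empty; nothing to resume")
    # Dicts preserve insertion order, so this is the first reported interrupt.
    return next(iter(output))


# Sample output mirroring the shape shown in Step 1 (the hash is made up).
suspend_output = {
    "abc123": {
        "message": "Waiting for external completion",
        "query": "test manual suspend",
    }
}

interrupt_id = first_interrupt_id(suspend_output)
# The resume input uses the interrupt_id as its key, as in Step 3.
resume_input = {interrupt_id: "MY RESUME DATA"}
print(resume_input)
```
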
## Step 2: Inspect What Was Saved

### Check the checkpoint:
```bash
uv run python -c "
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver
from graph_simple import builder
import asyncio

async def check():
    async with AsyncSqliteSaver.from_conn_string('__uipath/state.db') as saver:
        graph = builder.compile(checkpointer=saver)
        state = await graph.aget_state({'configurable': {'thread_id': 'default'}})
        print('State values:', state.values)
        print('Next tasks:', state.next)

asyncio.run(check())
"
```

### Check triggers in database:
```bash
sqlite3 __uipath/state.db "SELECT runtime_id, interrupt_id FROM __uipath_resume_triggers"
```

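The same trigger lookup can be done from Python with the standard-library `sqlite3` module. This is a sketch under assumptions: the table and column names come from the query in this step, the function name `list_resume_triggers` is hypothetical, and the demo builds a throwaway in-memory table with made-up rows rather than reading a real `__uipath/state.db`.

```python
import sqlite3


def list_resume_triggers(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Return (runtime_id, interrupt_id) pairs from the triggers table."""
    cursor = conn.execute(
        "SELECT runtime_id, interrupt_id FROM __uipath_resume_triggers"
    )
    return [(row[0], row[1]) for row in cursor.fetchall()]


# Demo against an in-memory database; the real file would be opened with
# sqlite3.connect("__uipath/state.db"). The schema below is an assumption
# that only includes the two columns the guide queries.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE __uipath_resume_triggers (runtime_id TEXT, interrupt_id TEXT)"
)
demo.execute(
    "INSERT INTO __uipath_resume_triggers VALUES (?, ?)",
    ("default", "abc123"),
)
triggers = list_resume_triggers(demo)
print(triggers)
demo.close()
```
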
## Step 3: Resume Execution

### Option A: Using CLI Resume (If Available)

```bash
# If the uipath CLI supports resume with data:
uv run uipath resume agent-simple \
  --thread-id default \
  --resume-data '{"<interrupt_id>": "MY RESUME DATA"}'
```

Replace `<interrupt_id>` with the actual interrupt ID from Step 1.

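Hand-writing the JSON payload is easy to get wrong once the resume data contains quotes. The sketch below builds the payload with `json.dumps` and shell-quotes it with `shlex.quote`. Note the assumptions: the helper name `build_resume_data` and the sample values are hypothetical, and whether the CLI actually accepts a `--resume-data` flag is unconfirmed, as Option A itself states.

```python
import json
import shlex


def build_resume_data(interrupt_id: str, data: str) -> str:
    """Serialize the {interrupt_id: data} mapping as a JSON string."""
    return json.dumps({interrupt_id: data})


# Made-up interrupt_id and resume data containing double quotes.
payload = build_resume_data("abc123", 'He said "done"')

# shlex.quote makes the payload safe to paste as a single shell argument.
print(shlex.quote(payload))
```
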
### Option B: Using Python Script (Recommended)

Create a resume script:

```bash
cat > test_manual_resume.py << 'EOF'
import asyncio
from uipath.runtime import UiPathRuntimeContext, UiPathExecuteOptions
from uipath_langchain.runtime.factory import UiPathLangGraphRuntimeFactory

async def main():
    # Prompt user for interrupt_id
    print("Enter the interrupt_id from the suspend output:")
    interrupt_id = input("> ").strip()

    print("\nEnter the data you want to provide for resume:")
    resume_data_value = input("> ").strip()

    # Create runtime
    ctx = UiPathRuntimeContext()
    factory = UiPathLangGraphRuntimeFactory(ctx)
    runtime = await factory.new_runtime(entrypoint="agent-simple", runtime_id="default")

    # Resume with provided data
    resume_input = {interrupt_id: resume_data_value}
    options = UiPathExecuteOptions(resume=True)

    print(f"\nResuming with data: {resume_input}")
    result = await runtime.execute(input=resume_input, options=options)

    print(f"\n✅ Status: {result.status}")
    print(f"Output: {result.output}")

    await factory.dispose()

if __name__ == "__main__":
    asyncio.run(main())
EOF

uv run python test_manual_resume.py
```

Example interaction:
```
Enter the interrupt_id from the suspend output:
> abc123def456...

Enter the data you want to provide for resume:
> Completed by manual testing

✅ Status: SUCCESSFUL
Output: {'query': 'test manual suspend', 'result': 'Completed with resume data: Completed by manual testing'}
```

## Step 4: Verify Final State

Check that the execution completed:

```bash
uv run python -c "
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver
from graph_simple import builder
import asyncio

async def check():
    async with AsyncSqliteSaver.from_conn_string('__uipath/state.db') as saver:
        graph = builder.compile(checkpointer=saver)
        state = await graph.aget_state({'configurable': {'thread_id': 'default'}})
        print('Final state:', state.values)
        print('Next tasks:', state.next)  # Should be empty

asyncio.run(check())
"
```

Expected output:
```
Final state: {'query': 'test manual suspend', 'result': 'Completed with resume data: Completed by manual testing'}
Next tasks: ()
```

## Full End-to-End Test Script

For convenience, here's a complete script that does both phases:

```bash
cat > test_full_cycle.py << 'EOF'
import asyncio
from uipath.runtime import UiPathRuntimeContext, UiPathExecuteOptions
from uipath_langchain.runtime.factory import UiPathLangGraphRuntimeFactory

async def main():
    ctx = UiPathRuntimeContext()
    factory = UiPathLangGraphRuntimeFactory(ctx)
    runtime = await factory.new_runtime(entrypoint="agent-simple", runtime_id="manual_test")

    print("=" * 80)
    print("PHASE 1: Execute and Suspend")
    print("=" * 80)

    result1 = await runtime.execute(input={"query": "test full cycle"})
    print(f"Status: {result1.status}")
    print(f"Interrupts: {result1.output}")

    if result1.status.name != "SUSPENDED":
        print("ERROR: Expected SUSPENDED status")
        return

    interrupt_id = list(result1.output.keys())[0]
    print(f"\n✓ Got interrupt_id: {interrupt_id[:16]}...")

    print("\n" + "=" * 80)
    print("PHASE 2: Resume")
    print("=" * 80)

    user_data = input("Enter data to provide for resume (or press Enter for default): ").strip()
    if not user_data:
        user_data = "Manual test completed"

    resume_input = {interrupt_id: user_data}
    options = UiPathExecuteOptions(resume=True)
    result2 = await runtime.execute(input=resume_input, options=options)

    print(f"\n✅ Status: {result2.status}")
    print(f"Final output: {result2.output}")

    await factory.dispose()

if __name__ == "__main__":
    asyncio.run(main())
EOF

uv run python test_full_cycle.py
```

## Common Issues

### Issue: "No checkpoint found"
- Make sure you're using the same `thread_id` / `runtime_id` for both suspend and resume
- Default is `"default"` for `uipath run`

### Issue: "Field required" validation error
- This was the bug we just fixed - make sure `graph_simple.py` returns a dict, not a State object

### Issue: "No triggers found in database"
- Triggers might have been deleted by a previous failed resume attempt
- Re-run the suspend phase (Step 1)

### Issue: Empty resume data
- Make sure you're providing the correct interrupt_id from the suspend output
- The interrupt_id is the key in the output dict from the suspend phase
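
Several of these issues come down to a mismatch between the ids used at suspend time and at resume time. The sketch below shows one way to check for that before resuming. It is an illustration only: the function `diagnose_resume` and its messages are hypothetical, and it assumes the caller has already read the `(runtime_id, interrupt_id)` pairs from the triggers table queried in Step 2.

```python
from typing import Optional


def diagnose_resume(
    triggers: list[tuple[str, str]],
    runtime_id: str,
    interrupt_id: str,
) -> Optional[str]:
    """Return a hint for a likely issue, or None if the inputs look consistent.

    `triggers` is a list of (runtime_id, interrupt_id) pairs.
    """
    if not triggers:
        return "No triggers found: re-run the suspend phase (Step 1)"
    ids_for_runtime = [iid for rid, iid in triggers if rid == runtime_id]
    if not ids_for_runtime:
        return "No trigger for this runtime_id: use the same id for suspend and resume"
    if interrupt_id not in ids_for_runtime:
        return "Unknown interrupt_id: copy the key from the suspend output"
    return None


rows = [("default", "abc123")]  # made-up trigger row
print(diagnose_resume(rows, "default", "abc123"))      # consistent inputs
print(diagnose_resume(rows, "manual_test", "abc123"))  # mismatched runtime_id
print(diagnose_resume([], "default", "abc123"))        # triggers already deleted
```
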

samples/tool-calling-suspend-resume/README.md

Lines changed: 41 additions & 11 deletions
@@ -101,10 +101,14 @@ async def invoke_process_node(state: State) -> State:
 - **`test_suspend_step2.py`** - Resume step (can be run separately)
 - **`inspect_state.py`** - Utility to decode and inspect checkpoint database
 
-### Configuration
+### Configuration & Evaluations
 - **`pyproject.toml`** - Python dependencies
 - **`uipath.json`** - Agent configuration
-- **`evaluations/`** - Evaluation sets for testing suspend/resume behavior
+- **`langgraph.json`** - Graph definitions (graph and agent-simple)
+- **`evaluations/`** - Evaluation framework for validating suspend/resume
+  - `eval-sets/test_simple_no_auth.json` - Test cases for suspend/resume validation
+  - `evaluators/resume-completed-evaluator.json` - Contains evaluator checking completion text
+  - `evaluators/suspend-resume-trajectory-evaluator.json` - LLM judge for trajectory validation
 
 ## Running the Demo
 
@@ -133,24 +137,50 @@ uv run python demo_suspend_resume.py resume
 
 ## Using with UiPath Evaluation Runtime
 
-Test with the evaluation runtime to see how triggers are extracted:
+Test with the evaluation runtime to see how suspend/resume is validated:
 
 ```bash
-# Simple variant (no authentication required)
-uv run uipath eval agent-simple evaluations/eval-sets/test_simple_no_auth.json
+# IMPORTANT: Clean previous checkpoint state first!
+rm -rf __uipath/state.db
 
-# Full RPA variant (requires authentication)
-uv run uipath eval graph evaluations/eval-sets/test_suspend_resume.json
+# Run evaluation - agent will suspend
+uv run uipath eval agent-simple evaluations/eval-sets/test_simple_no_auth.json
 ```
 
+**What this tests**:
+- Agent executes and calls `interrupt()` → suspends
+- Evaluation runtime detects SUSPENDED status
+- Triggers are extracted (API resume triggers with inbox IDs)
+- Evaluators are skipped during suspend (they run after resume)
+- Status propagates to orchestrator
+- Checkpoint saved to `__uipath/state.db` (40KB)
+
 Expected output:
 ```
-🔴 DETECTED SUSPENSION → Runtime detects SUSPENDED status
-📋 Extracted 1 trigger(s) → Shows InvokeProcess trigger details
-⏭️ Skipping evaluators → Evaluators run after resume
-✅ Result: SUSPENDED with triggers
+EVAL RUNTIME: Resume mode: False
+🔴 EVAL RUNTIME: DETECTED SUSPENSION
+EVAL RUNTIME: Agent returned SUSPENDED status
+EVAL RUNTIME: Extracted 2 trigger(s) from suspended execution
+EVAL RUNTIME: Propagating SUSPENDED status from inner runtime
+✓ Basic suspend/resume with query - No evaluators
 ```
 
+**Note**: The `--resume` flag exists in the eval command, but the full resume flow (providing resume data to `interrupt()`) is handled by the orchestrator in production. For local testing of the complete suspend/resume cycle, use `demo_suspend_resume.py`.
+
+### Evaluators
+
+The sample includes two evaluators that validate suspend/resume behavior:
+
+**`ResumeCompletedEvaluator`** (ContainsEvaluator)
+- Checks that the result contains "Completed with resume data"
+- Validates the agent completed successfully after resume
+
+**`SuspendResumeTrajectoryEvaluator`** (LLM Judge)
+- Uses GPT-4 to evaluate the entire suspend/resume trajectory
+- Assesses whether the agent properly suspended and resumed
+
+These evaluators run AFTER resume completes, not during the suspend phase.
+
 ## Inspecting the State Database
 
 Want to see what's stored in the checkpoint?
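
The README changes in this commit describe `ResumeCompletedEvaluator` as a contains-style check on the agent's result. A minimal sketch of that kind of check follows. It is an illustration, not the UiPath evaluator implementation: the function name `contains_completion_text` and the sample outputs are assumptions, and only the expected text comes from the README.

```python
EXPECTED_TEXT = "Completed with resume data"


def contains_completion_text(output: dict, expected: str = EXPECTED_TEXT) -> bool:
    """Return True if the agent's `result` field contains the expected text."""
    result = output.get("result")
    return isinstance(result, str) and expected in result


# Output shaped like the final state after a successful resume.
resumed = {
    "query": "test manual suspend",
    "result": "Completed with resume data: Completed by manual testing",
}
# Output shaped like a suspended run: keyed by interrupt_id, no result yet.
suspended = {"abc123": {"message": "Waiting for external completion"}}

print(contains_completion_text(resumed))
print(contains_completion_text(suspended))
```

A check like this only makes sense on the post-resume output, which matches the README's statement that evaluators run after resume completes rather than during the suspend phase.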
