Skip to content

cookbook: add integration_ejentum_cognitive_harness notebook#2994

Open
ejentum wants to merge 2 commits into
langfuse:mainfrom
ejentum:cookbook-ejentum-cognitive-harness
Open

cookbook: add integration_ejentum_cognitive_harness notebook#2994
ejentum wants to merge 2 commits into
langfuse:mainfrom
ejentum:cookbook-ejentum-cognitive-harness

Conversation

@ejentum
Copy link
Copy Markdown

@ejentum ejentum commented May 23, 2026

Summary

Adds a new cookbook notebook under cookbook/integration_ejentum_cognitive_harness.ipynb that traces an Ejentum cognitive harness agent loop in Langfuse. Each turn produces one trace with two child observations:

  • ejentum.call: an @observe-decorated REST wrapper around https://ejentum-main-ab125c3.zuplo.app/logicv1/, with mode, query_length, and scaffold_length attached to the observation metadata.
  • openai.chat.completions.create: from the langfuse.openai drop-in. Prompt, response, model, and token usage are captured automatically.

The pattern generalises beyond Ejentum: any in-loop REST call to a third-party tool can be wrapped the same way. Ejentum is the worked example because the agent calls a single endpoint with a mode argument, which keeps the cookbook short.

File

  • cookbook/integration_ejentum_cognitive_harness.ipynb (new)

Naming and structure match existing cookbook integration entries (integration_agno_agents.ipynb, example_langgraph_agents.ipynb, etc.): one self-contained Jupyter notebook with install + setup + demo + "what to look at" sections.

What the notebook walks through

  1. Install (langfuse, openai, requests).
  2. Configure the Langfuse client + check the four required env vars are set.
  3. Define call_ejentum as an @observe(name=\"ejentum.call\") function with update_current_observation(metadata=...) for the mode and scaffold length.
  4. Use the from langfuse.openai import openai drop-in for auto-instrumented OpenAI calls.
  5. Run a two-turn agent that calls harness_reasoning then harness_anti_deception, splicing the returned scaffold into the system message.
  6. Walks readers through filtering traces by metadata.mode and metadata.scaffold_length in the Langfuse UI.

Affiliation

I maintain the Ejentum harness API. Submitting this as a tracing cookbook because the four-step pattern (drop-in OpenAI client + @observe REST wrapper + metadata on the observation + scaffold-aware system message) is generally useful for any third-party REST tool, and Ejentum is a clean worked example. The notebook explicitly says this generalises. Ejentum has free and paid tiers; the notebook links to the dashboard for keys, not to a checkout.

Test plan

  • Notebook validates as JSON.
  • Mirrors the shape of existing cookbook integration entries.
  • Voice scrubbed: zero em dashes, no marketing language.
  • No new top-level dependencies (only the in-notebook %pip install).
  • Local Jupyter run (cannot run with live Langfuse + OpenAI + Ejentum keys in this environment; happy to follow up if the reviewer spots an issue).

Greptile Summary

Adds a new Jupyter notebook cookbook demonstrating how to trace an Ejentum cognitive-harness REST API call alongside an OpenAI completion in Langfuse, using the @observe decorator and langfuse.openai drop-in.

  • Introduces call_ejentum — an @observe-decorated wrapper that posts to the Ejentum endpoint, parses the scaffold from the JSON response, and attaches mode/query_length/scaffold_length as observation metadata.
  • Defines a harness_augmented_turn function that nests the Ejentum call and an auto-instrumented openai.chat.completions.create inside a single parent trace, then flushes at the end.
  • Documents UI filter patterns for the Traces view that currently refer to observation-level metadata rather than trace-level metadata.

Confidence Score: 3/5

The notebook has two functional defects: an unguarded list index that can raise at runtime, and metadata placed on the wrong Langfuse context level so the documented UI filters don't work.

The scaffold-extraction line payload[0].get(mode, "") raises IndexError on an empty list response rather than returning an empty string, which can happen on quota or validation failures. The cookbook's "Useful filters" instructions point readers to metadata.mode and metadata.scaffold_length in the Traces view, but those keys are attached to the child observation via update_current_observation, not to the root trace — the filter expressions will silently return nothing. Both issues affect the primary demo code and the tutorial steps a reader would follow.

cookbook/integration_ejentum_cognitive_harness.ipynb — the call_ejentum response-parsing block and the metadata-propagation strategy both need changes before this notebook produces the experience it describes.

Sequence Diagram

sequenceDiagram
    participant User
    participant harness_augmented_turn
    participant call_ejentum
    participant EjentumAPI
    participant OpenAI
    participant Langfuse

    User->>harness_augmented_turn: call(task, mode)
    Note over harness_augmented_turn,Langfuse: @observe creates root trace
    harness_augmented_turn->>call_ejentum: call_ejentum(task, mode)
    Note over call_ejentum,Langfuse: @observe creates child span "ejentum.call"
    call_ejentum->>EjentumAPI: "POST /logicv1/ {query, mode}"
    EjentumAPI-->>call_ejentum: "[{mode: scaffold_text}]"
    call_ejentum->>Langfuse: update_current_observation(metadata)
    call_ejentum-->>harness_augmented_turn: scaffold string
    harness_augmented_turn->>OpenAI: chat.completions.create(gpt-4o-mini)
    Note over harness_augmented_turn,Langfuse: langfuse.openai auto-instruments call
    OpenAI-->>harness_augmented_turn: completion
    harness_augmented_turn-->>User: answer text
    User->>Langfuse: langfuse.flush()
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
cookbook/integration_ejentum_cognitive_harness.ipynb:127
**IndexError on empty API response list**

`payload[0]` is accessed without guarding against an empty list. If the Ejentum endpoint returns `[]` (e.g. on a quota or validation error that still yields HTTP 200), this line throws `IndexError` at runtime and the `@observe` span is never closed cleanly. The `isinstance(payload, list)` check only protects against a non-list response, not an empty one.

### Issue 2 of 2
cookbook/integration_ejentum_cognitive_harness.ipynb:128-137
**Trace-level metadata missing — described UI filters will not work**

The notebook instructs readers to filter the Traces view by `metadata.mode` and `metadata.scaffold_length`, but `langfuse_context.update_current_observation(metadata=...)` inside `call_ejentum` attaches those keys to the child span (`ejentum.call`), not to the root trace created by `harness_augmented_turn`. The Traces view in Langfuse filters on trace-level metadata, so both suggested filter expressions will silently return no results. To surface these values at the trace level, `langfuse_context.update_current_trace(metadata={"mode": mode, "scaffold_length": ...})` should be called (from within `call_ejentum` or `harness_augmented_turn`) in addition to the existing observation-level update.

Reviews (1): Last reviewed commit: "cookbook: add integration_ejentum_cognit..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

Adds a cookbook notebook that traces an Ejentum cognitive harness agent
loop in Langfuse. Each turn produces a Langfuse trace with two child
observations: ejentum.call (manual @observe-decorated REST wrapper with
mode and scaffold length on metadata) and openai.chat.completions.create
(from the langfuse.openai drop-in).

The pattern generalises to any third-party REST tool an agent calls
in-loop; Ejentum is the worked example because the agent calls a single
endpoint with a mode arg.
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@review-notebook-app
Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@vercel
Copy link
Copy Markdown

vercel Bot commented May 23, 2026

@ejentum is attempting to deploy a commit to the langfuse Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 23, 2026

CLA assistant check
All committers have signed the CLA.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. documentation Improvements or additions to documentation labels May 23, 2026
"\n",
"Each turn does the same shape:\n",
"\n",
"1. `call_ejentum(...)` (Langfuse observation with mode + scaffold length).\n",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 IndexError on empty API response list

payload[0] is accessed without guarding against an empty list. If the Ejentum endpoint returns [] (e.g. on a quota or validation error that still yields HTTP 200), this line throws IndexError at runtime and the @observe span is never closed cleanly. The isinstance(payload, list) check only protects against a non-list response, not an empty one.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cookbook/integration_ejentum_cognitive_harness.ipynb
Line: 127

Comment:
**IndexError on empty API response list**

`payload[0]` is accessed without guarding against an empty list. If the Ejentum endpoint returns `[]` (e.g. on a quota or validation error that still yields HTTP 200), this line throws `IndexError` at runtime and the `@observe` span is never closed cleanly. The `isinstance(payload, list)` check only protects against a non-list response, not an empty one.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +128 to +137
"2. `openai.chat.completions.create(...)` from the auto-instrumented client.\n",
"\n",
"Both observations sit inside a single trace per turn, so you can drill from the trace summary into the harness retrieval, the system message it produced, and the model call that consumed it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Trace-level metadata missing — described UI filters will not work

The notebook instructs readers to filter the Traces view by metadata.mode and metadata.scaffold_length, but langfuse_context.update_current_observation(metadata=...) inside call_ejentum attaches those keys to the child span (ejentum.call), not to the root trace created by harness_augmented_turn. The Traces view in Langfuse filters on trace-level metadata, so both suggested filter expressions will silently return no results. To surface these values at the trace level, langfuse_context.update_current_trace(metadata={"mode": mode, "scaffold_length": ...}) should be called (from within call_ejentum or harness_augmented_turn) in addition to the existing observation-level update.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cookbook/integration_ejentum_cognitive_harness.ipynb
Line: 128-137

Comment:
**Trace-level metadata missing — described UI filters will not work**

The notebook instructs readers to filter the Traces view by `metadata.mode` and `metadata.scaffold_length`, but `langfuse_context.update_current_observation(metadata=...)` inside `call_ejentum` attaches those keys to the child span (`ejentum.call`), not to the root trace created by `harness_augmented_turn`. The Traces view in Langfuse filters on trace-level metadata, so both suggested filter expressions will silently return no results. To surface these values at the trace level, `langfuse_context.update_current_trace(metadata={"mode": mode, "scaffold_length": ...})` should be called (from within `call_ejentum` or `harness_augmented_turn`) in addition to the existing observation-level update.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants