cookbook: add integration_ejentum_cognitive_harness notebook #2994
base: main
Changes from 1 commit
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,210 @@ | ||
| { | ||
| "cells": [ | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "# Tracing an Ejentum cognitive harness agent loop in Langfuse\n", | ||
| "\n", | ||
| "This cookbook shows how to instrument an agent that calls the [Ejentum cognitive harness](https://ejentum.com) REST API before each LLM call, so each turn shows up in Langfuse as a nested trace containing the harness retrieval, the model call, and the model's token usage.\n", | ||
| "\n", | ||
| "The pattern is reusable. Any in-loop REST call to a third-party tool can be wrapped the same way; Ejentum is the worked example because the agent calls one endpoint with a `mode` argument, which keeps the cookbook short.\n", | ||
| "\n", | ||
| "## What you'll set up\n", | ||
| "\n", | ||
| "1. The Langfuse client and a project to send traces to.\n", | ||
| "2. The `from langfuse.openai import openai` drop-in so OpenAI calls are auto-instrumented.\n", | ||
| "3. An `@observe`-decorated function around each Ejentum REST call so the harness retrieval appears as a child observation with the `mode` and a scaffold length attached.\n", | ||
| "4. A two-turn agent that calls `harness_augmented_turn` with `mode=\"reasoning\"` and then `mode=\"anti-deception\"`, answering a reasoning-heavy task each turn.\n", | ||
| "\n", | ||
| "Open the trace in your Langfuse project to inspect the tree per turn." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 1. Install" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "%pip install -q langfuse openai requests" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 2. Configure Langfuse\n", | ||
| "\n", | ||
| "Set `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_HOST` from your Langfuse project. The host defaults to `https://cloud.langfuse.com`; self-hosted users should set this to their instance URL." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "import os\n", | ||
| "\n", | ||
| "# Replace with your project's keys, or load from a .env file.\n", | ||
| "os.environ.setdefault(\"LANGFUSE_HOST\", \"https://cloud.langfuse.com\")\n", | ||
| "assert os.environ.get(\"LANGFUSE_PUBLIC_KEY\"), \"LANGFUSE_PUBLIC_KEY is not set.\"\n", | ||
| "assert os.environ.get(\"LANGFUSE_SECRET_KEY\"), \"LANGFUSE_SECRET_KEY is not set.\"\n", | ||
| "assert os.environ.get(\"OPENAI_API_KEY\"), \"OPENAI_API_KEY is not set.\"\n", | ||
| "assert os.environ.get(\"EJENTUM_API_KEY\"), \"EJENTUM_API_KEY is not set. Get one at https://ejentum.com/dashboard\"\n", | ||
| "\n", | ||
| "from langfuse import Langfuse\n", | ||
| "langfuse = Langfuse()\n", | ||
| "print(\"Langfuse host:\", os.environ[\"LANGFUSE_HOST\"])" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 3. Auto-instrument OpenAI and wrap the Ejentum REST call\n", | ||
| "\n", | ||
| "Langfuse ships a drop-in for OpenAI (`from langfuse.openai import openai`). It captures the request, response, model, and token usage automatically. For Ejentum we add a thin `@observe`-decorated wrapper so each harness retrieval shows up as a child observation in the same trace, with the `mode` and scaffold length surfaced for filtering." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "import requests\n", | ||
| "from langfuse.decorators import langfuse_context, observe\n", | ||
| "from langfuse.openai import openai\n", | ||
| "\n", | ||
| "EJENTUM_URL = \"https://ejentum-main-ab125c3.zuplo.app/logicv1/\"\n", | ||
| "\n", | ||
| "\n", | ||
| "@observe(name=\"ejentum.call\")\n", | ||
| "def call_ejentum(query: str, mode: str = \"reasoning\") -> str:\n", | ||
| "    \"\"\"Fetch a cognitive scaffold from the Ejentum REST gateway.\n", | ||
| "\n", | ||
| "    Emits a Langfuse observation with input, output, and metadata\n", | ||
| "    (mode, query_length, scaffold_length).\n", | ||
| "    \"\"\"\n", | ||
| "    r = requests.post(\n", | ||
| "        EJENTUM_URL,\n", | ||
| "        headers={\n", | ||
| "            \"Authorization\": f\"Bearer {os.environ['EJENTUM_API_KEY']}\",\n", | ||
| "            \"Content-Type\": \"application/json\",\n", | ||
| "        },\n", | ||
| "        json={\"query\": query, \"mode\": mode},\n", | ||
| "        timeout=10,\n", | ||
| "    )\n", | ||
| "    r.raise_for_status()\n", | ||
| "    payload = r.json()\n", | ||
| "    scaffold = payload[0].get(mode, \"\") if isinstance(payload, list) else \"\"\n", | ||
| "    langfuse_context.update_current_observation(\n", | ||
| "        metadata={\n", | ||
| "            \"mode\": mode,\n", | ||
| "            \"query_length\": len(query),\n", | ||
| "            \"scaffold_length\": len(scaffold),\n", | ||
| "        },\n", | ||
| "    )\n", | ||
| "    return scaffold" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 4. Run a two-turn agent and view the trace\n", | ||
| "\n", | ||
| "Each turn has the same shape:\n", | ||
| "\n", | ||
| "1. `call_ejentum(...)` (Langfuse observation with mode + scaffold length).\n", | ||
| "2. `openai.chat.completions.create(...)` from the auto-instrumented client.\n", | ||
| "\n", | ||
| "Both observations sit inside a single trace per turn, so you can drill from the trace summary into the harness retrieval, the system message it produced, and the model call that consumed it." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
|
Comment on lines +130 to +139
Contributor
Path: cookbook/integration_ejentum_cognitive_harness.ipynb
Line: 128-137
Comment:
**Trace-level metadata missing — described UI filters will not work**
The notebook instructs readers to filter the Traces view by `metadata.mode` and `metadata.scaffold_length`, but `langfuse_context.update_current_observation(metadata=...)` inside `call_ejentum` attaches those keys to the child span (`ejentum.call`), not to the root trace created by `harness_augmented_turn`. The Traces view in Langfuse filters on trace-level metadata, so both suggested filter expressions will silently return no results. To surface these values at the trace level, `langfuse_context.update_current_trace(metadata={"mode": mode, "scaffold_length": ...})` should be called (from within `call_ejentum` or `harness_augmented_turn`) in addition to the existing observation-level update.
How can I resolve this? If you propose a fix, please make it concise. |
||
| "source": [ | ||
| "@observe(name=\"harness_augmented_turn\")\n", | ||
| "def harness_augmented_turn(task: str, mode: str = \"reasoning\") -> str:\n", | ||
| "    scaffold = call_ejentum(task, mode=mode)\n", | ||
| "    completion = openai.chat.completions.create(\n", | ||
| "        model=\"gpt-4o-mini\",\n", | ||
| "        messages=[\n", | ||
| "            {\n", | ||
| "                \"role\": \"system\",\n", | ||
| "                \"content\": (\n", | ||
| "                    \"Apply the cognitive scaffold below before answering.\\n\\n\"\n", | ||
| "                    f\"[SCAFFOLD]\\n{scaffold}\\n[END]\"\n", | ||
| "                ),\n", | ||
| "            },\n", | ||
| "            {\"role\": \"user\", \"content\": task},\n", | ||
| "        ],\n", | ||
| "        temperature=0,\n", | ||
| "    )\n", | ||
| "    return completion.choices[0].message.content or \"\"\n", | ||
| "\n", | ||
| "\n", | ||
| "print(harness_augmented_turn(\n", | ||
| "    \"We have 50M users and want to add a NOT NULL column. \"\n", | ||
| "    \"Walk through the trade-offs and recommend a migration path.\",\n", | ||
| "    mode=\"reasoning\",\n", | ||
| "))\n", | ||
| "print(\"---\")\n", | ||
| "print(harness_augmented_turn(\n", | ||
| "    \"A user insists their DB is the bottleneck and refuses other diagnostics. \"\n", | ||
| "    \"Walk through whether the framing is sound before recommending.\",\n", | ||
| "    mode=\"anti-deception\",\n", | ||
| "))\n", | ||
| "\n", | ||
| "# Flush so the traces show up immediately rather than on process exit.\n", | ||
| "langfuse.flush()" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## What to look at in Langfuse\n", | ||
| "\n", | ||
| "Open your Langfuse project. For each turn you should see one trace with two child observations:\n", | ||
| "\n", | ||
| "- `ejentum.call`: the harness retrieval. Open the metadata to confirm `mode` and `scaffold_length`.\n", | ||
| "- `openai.chat.completions.create` (from `langfuse.openai`): the model call. Token usage, model name, prompt, and response are captured automatically.\n", | ||
| "\n", | ||
| "Useful filters in the Traces view:\n", | ||
| "\n", | ||
| "- `metadata.mode = \"anti-deception\"` to see only deception-resistance turns.\n", | ||
| "- `metadata.scaffold_length > 0` to confirm every harness call returned a non-empty scaffold.\n", | ||
| "\n", | ||
| "## Adapting the pattern\n", | ||
| "\n", | ||
| "Any in-loop REST call to a third-party tool can be instrumented the same way. The four steps in this cookbook (drop-in OpenAI client + `@observe`-decorated REST wrapper + metadata on the observation + scaffold-aware system message) generalise to harness calls in any agent loop. For larger agent frameworks, swap step 4 for the framework's native control flow; the observation shape stays the same." | ||
| ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
| "kernelspec": { | ||
| "display_name": "Python 3", | ||
| "language": "python", | ||
| "name": "python3" | ||
| }, | ||
| "language_info": { | ||
| "name": "python", | ||
| "version": "3.11" | ||
| } | ||
| }, | ||
| "nbformat": 4, | ||
| "nbformat_minor": 5 | ||
| } | ||
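One possible way to address the review comment on lines +130 to +139 is sketched below, under stated assumptions. The helper names `harness_metadata` and `publish_harness_metadata` are hypothetical and do not appear in the PR. The `context` argument is assumed to be the `langfuse_context` object the notebook already imports, and `update_current_trace(metadata=...)` is the call the reviewer names; the sketch only assumes it accepts a `metadata` keyword in the same way `update_current_observation` does.

```python
from typing import Any, Dict


def harness_metadata(mode: str, query: str, scaffold: str) -> Dict[str, Any]:
    """Build the metadata dict the notebook attaches to the harness call."""
    return {
        "mode": mode,
        "query_length": len(query),
        "scaffold_length": len(scaffold),
    }


def publish_harness_metadata(
    context: Any, mode: str, query: str, scaffold: str
) -> Dict[str, Any]:
    """Attach the same metadata to the current observation and to its trace.

    `context` is assumed to be `langfuse_context` (hypothetical wiring):
    any object exposing `update_current_observation` and
    `update_current_trace` with a `metadata` keyword will work.
    """
    metadata = harness_metadata(mode, query, scaffold)
    # Observation-level: visible when opening the `ejentum.call` span.
    context.update_current_observation(metadata=metadata)
    # Trace-level: what the reviewer says the Traces view filters on.
    context.update_current_trace(metadata=metadata)
    return metadata
```

Inside `call_ejentum`, the existing `update_current_observation(...)` call could then be replaced by `publish_harness_metadata(langfuse_context, mode, query, scaffold)`. Passing the context in as an argument keeps the helper easy to exercise with a stand-in object.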
`payload[0]` is accessed without guarding against an empty list. If the Ejentum endpoint returns `[]` (e.g. on a quota or validation error that still yields HTTP 200), this line throws `IndexError` at runtime and the `@observe` span is never closed cleanly. The `isinstance(payload, list)` check only protects against a non-list response, not an empty one.
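A minimal sketch of the guard this comment asks for, assuming the response shape the notebook already relies on (a JSON list whose first element is a dict keyed by `mode`). The function name `extract_scaffold` is hypothetical and not part of the PR; it could replace the one-line expression in `call_ejentum`.

```python
from typing import Any


def extract_scaffold(payload: Any, mode: str) -> str:
    """Return the scaffold text for `mode`, or "" for any unexpected shape."""
    # Reject non-list payloads and the empty list in a single check.
    if not isinstance(payload, list) or not payload:
        return ""
    first = payload[0]
    # The first element is assumed to be a dict keyed by mode name.
    if not isinstance(first, dict):
        return ""
    value = first.get(mode, "")
    # Only hand back strings so len(scaffold) stays meaningful downstream.
    return value if isinstance(value, str) else ""
```

With this helper, `scaffold = extract_scaffold(r.json(), mode)` returns an empty string for `[]` instead of raising, so the `scaffold_length` metadata is recorded as `0` and the turn can still proceed.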