cookbook: add integration_ejentum_cognitive_harness notebook #2994
Open: ejentum wants to merge 2 commits into langfuse:main from ejentum:cookbook-ejentum-cognitive-harness
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,212 @@ | ||
| { | ||
| "cells": [ | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "# Tracing an Ejentum cognitive harness agent loop in Langfuse\n", | ||
| "\n", | ||
| "This cookbook shows how to instrument an agent that calls the [Ejentum cognitive harness](https://ejentum.com) REST API before each LLM call, so each turn shows up in Langfuse as a nested trace containing the harness retrieval and the model call with its token usage.\n", | ||
| "\n", | ||
| "The pattern is reusable. Any in-loop REST call to a third-party tool can be wrapped the same way; Ejentum is the worked example because the agent calls one endpoint with a `mode` argument, which keeps the cookbook short.\n", | ||
| "\n", | ||
| "## What you'll set up\n", | ||
| "\n", | ||
| "1. The Langfuse v3 client and a project to send traces to.\n", | ||
| "2. The `from langfuse.openai import openai` drop-in so OpenAI calls are auto-instrumented.\n", | ||
| "3. An `@observe`-decorated function around each Ejentum REST call so the harness retrieval appears as a child span, with the `mode` and a scaffold length attached via `update_current_span(metadata=...)`.\n", | ||
| "4. A two-turn agent that calls the harness in `reasoning` mode, then in `anti-deception` mode, and answers a reasoning-heavy task each turn.\n", | ||
| "\n", | ||
| "Open the trace in your Langfuse project to inspect the tree per turn." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 1. Install" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "%pip install -q \"langfuse>=3.0\" openai requests" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 2. Configure Langfuse\n", | ||
| "\n", | ||
| "Set `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_HOST` from your Langfuse project. The host defaults to `https://cloud.langfuse.com`; self-hosted users should set this to their instance URL." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "import os\n", | ||
| "\n", | ||
| "# Replace with your project's keys, or load from a .env file.\n", | ||
| "os.environ.setdefault(\"LANGFUSE_HOST\", \"https://cloud.langfuse.com\")\n", | ||
| "assert os.environ.get(\"LANGFUSE_PUBLIC_KEY\"), \"LANGFUSE_PUBLIC_KEY is not set.\"\n", | ||
| "assert os.environ.get(\"LANGFUSE_SECRET_KEY\"), \"LANGFUSE_SECRET_KEY is not set.\"\n", | ||
| "assert os.environ.get(\"OPENAI_API_KEY\"), \"OPENAI_API_KEY is not set.\"\n", | ||
| "assert os.environ.get(\"EJENTUM_API_KEY\"), \"EJENTUM_API_KEY is not set. Get one at https://ejentum.com/dashboard\"\n", | ||
| "\n", | ||
| "from langfuse import get_client\n", | ||
| "\n", | ||
| "langfuse = get_client()\n", | ||
| "assert langfuse.auth_check(), \"Langfuse auth check failed; verify your keys.\"\n", | ||
| "print(\"Langfuse client authenticated. Host:\", os.environ[\"LANGFUSE_HOST\"])" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 3. Auto-instrument OpenAI and wrap the Ejentum REST call\n", | ||
| "\n", | ||
| "Langfuse ships a drop-in for OpenAI (`from langfuse.openai import openai`). It captures the request, response, model, and token usage automatically. For Ejentum we add a thin `@observe`-decorated wrapper so each harness retrieval shows up as a child span in the same trace, with the `mode` and scaffold length surfaced via `langfuse.update_current_span(metadata=...)`." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "import requests\n", | ||
| "from langfuse import observe\n", | ||
| "from langfuse.openai import openai\n", | ||
| "\n", | ||
| "EJENTUM_URL = \"https://ejentum-main-ab125c3.zuplo.app/logicv1/\"\n", | ||
| "\n", | ||
| "\n", | ||
| "@observe(name=\"ejentum.call\")\n", | ||
| "def call_ejentum(query: str, mode: str = \"reasoning\") -> str:\n", | ||
| "    \"\"\"Fetch a cognitive scaffold from the Ejentum REST gateway.\n", | ||
| "\n", | ||
| "    Emits a Langfuse span with input, output, and metadata\n", | ||
| "    (mode, query_length, scaffold_length).\n", | ||
| "    \"\"\"\n", | ||
| "    r = requests.post(\n", | ||
| "        EJENTUM_URL,\n", | ||
| "        headers={\n", | ||
| "            \"Authorization\": f\"Bearer {os.environ['EJENTUM_API_KEY']}\",\n", | ||
| "            \"Content-Type\": \"application/json\",\n", | ||
| "        },\n", | ||
| "        json={\"query\": query, \"mode\": mode},\n", | ||
| "        timeout=10,\n", | ||
| "    )\n", | ||
| "    r.raise_for_status()\n", | ||
| "    payload = r.json()\n", | ||
| "    scaffold = payload[0].get(mode, \"\") if isinstance(payload, list) else \"\"\n", | ||
| "    langfuse.update_current_span(\n", | ||
| "        metadata={\n", | ||
| "            \"mode\": mode,\n", | ||
| "            \"query_length\": len(query),\n", | ||
| "            \"scaffold_length\": len(scaffold),\n", | ||
| "        },\n", | ||
| "    )\n", | ||
| "    return scaffold" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## 4. Run a two-turn agent and view the trace\n", | ||
| "\n", | ||
| "Each turn has the same shape:\n", | ||
| "\n", | ||
| "1. `call_ejentum(...)` (Langfuse span with mode + scaffold length).\n", | ||
| "2. `openai.chat.completions.create(...)` from the auto-instrumented client.\n", | ||
| "\n", | ||
| "Both spans sit inside a single trace per turn, so you can drill from the trace summary into the harness retrieval, the system message it produced, and the model call that consumed it." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "@observe(name=\"harness_augmented_turn\")\n", | ||
| "def harness_augmented_turn(task: str, mode: str = \"reasoning\") -> str:\n", | ||
| "    scaffold = call_ejentum(task, mode=mode)\n", | ||
| "    completion = openai.chat.completions.create(\n", | ||
| "        model=\"gpt-4o-mini\",\n", | ||
| "        messages=[\n", | ||
| "            {\n", | ||
| "                \"role\": \"system\",\n", | ||
| "                \"content\": (\n", | ||
| "                    \"Apply the cognitive scaffold below before answering.\\n\\n\"\n", | ||
| "                    f\"[SCAFFOLD]\\n{scaffold}\\n[END]\"\n", | ||
| "                ),\n", | ||
| "            },\n", | ||
| "            {\"role\": \"user\", \"content\": task},\n", | ||
| "        ],\n", | ||
| "        temperature=0,\n", | ||
| "    )\n", | ||
| "    return completion.choices[0].message.content or \"\"\n", | ||
| "\n", | ||
| "\n", | ||
| "print(harness_augmented_turn(\n", | ||
| "    \"We have 50M users and want to add a NOT NULL column. \"\n", | ||
| "    \"Walk through the trade-offs and recommend a migration path.\",\n", | ||
| "    mode=\"reasoning\",\n", | ||
| "))\n", | ||
| "print(\"---\")\n", | ||
| "print(harness_augmented_turn(\n", | ||
| "    \"A user insists their DB is the bottleneck and refuses other diagnostics. \"\n", | ||
| "    \"Walk through whether the framing is sound before recommending.\",\n", | ||
| "    mode=\"anti-deception\",\n", | ||
| "))\n", | ||
| "\n", | ||
| "# Flush so the traces show up immediately rather than on process exit.\n", | ||
| "langfuse.flush()" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## What to look at in Langfuse\n", | ||
| "\n", | ||
| "Open your Langfuse project. For each turn you should see one trace with two child spans:\n", | ||
| "\n", | ||
| "- `ejentum.call`: the harness retrieval. Open the metadata panel to confirm `mode` and `scaffold_length`.\n", | ||
| "- `openai.chat.completions.create` (from `langfuse.openai`): the model call. Token usage, model name, prompt, and response are captured automatically.\n", | ||
| "\n", | ||
| "Useful filters in the Traces view:\n", | ||
| "\n", | ||
| "- `metadata.mode = \"anti-deception\"` to see only deception-resistance turns.\n", | ||
| "- `metadata.scaffold_length > 0` to confirm every harness call returned a non-empty scaffold.\n", | ||
| "\n", | ||
| "## Adapting the pattern\n", | ||
| "\n", | ||
| "Any in-loop REST call to a third-party tool can be instrumented the same way. The four steps in this cookbook (drop-in OpenAI client + `@observe`-decorated REST wrapper + metadata via `update_current_span` + scaffold-aware system message) generalise to harness calls in any agent loop. For larger agent frameworks, swap step 4 for the framework's native control flow; the span shape stays the same." | ||
| ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
| "kernelspec": { | ||
| "display_name": "Python 3", | ||
| "language": "python", | ||
| "name": "python3" | ||
| }, | ||
| "language_info": { | ||
| "name": "python", | ||
| "version": "3.11" | ||
| } | ||
| }, | ||
| "nbformat": 4, | ||
| "nbformat_minor": 5 | ||
| } | ||
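In the diff above, `call_ejentum` indexes `payload[0]` directly, which would raise on an empty list. The following is a minimal sketch of a more defensive extraction. It assumes the response shape the notebook implies (a list whose first element maps the mode name to the scaffold text), which is not a documented contract here. The helper name `extract_scaffold` is hypothetical and not part of the PR.

```python
from typing import Any


def extract_scaffold(payload: Any, mode: str) -> str:
    """Return the scaffold text for `mode`, or "" for any unexpected shape.

    Assumed shape (taken from the notebook, not from API documentation):
    a non-empty list whose first element is a dict keyed by mode name.
    """
    # Anything that is not a non-empty list yields an empty scaffold.
    if not isinstance(payload, list) or not payload:
        return ""
    first = payload[0]
    if not isinstance(first, dict):
        return ""
    value = first.get(mode, "")
    # Guard against null or non-string values so len(scaffold) stays safe.
    return value if isinstance(value, str) else ""


if __name__ == "__main__":
    print(repr(extract_scaffold([{"reasoning": "step one"}], "reasoning")))
    print(repr(extract_scaffold([], "reasoning")))
```

With this guard, an empty scaffold shows up as `scaffold_length` 0 in the span metadata instead of failing the turn with an `IndexError`.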
The notebook instructs readers to filter the Traces view by `metadata.mode` and `metadata.scaffold_length`, but `langfuse_context.update_current_observation(metadata=...)` inside `call_ejentum` attaches those keys to the child span (`ejentum.call`), not to the root trace created by `harness_augmented_turn`. The Traces view in Langfuse filters on trace-level metadata, so both suggested filter expressions will silently return no results. To surface these values at the trace level, `langfuse_context.update_current_trace(metadata={"mode": mode, "scaffold_length": ...})` should be called (from within `call_ejentum` or `harness_augmented_turn`) in addition to the existing observation-level update.