From accda6f65fb3341b1c8e1b6b7735c8f5f07dd416 Mon Sep 17 00:00:00 2001 From: "ejentum.com" Date: Sat, 23 May 2026 20:10:54 +0300 Subject: [PATCH 1/2] cookbook: add integration_ejentum_cognitive_harness notebook Adds a cookbook notebook that traces an Ejentum cognitive harness agent loop in Langfuse. Each turn produces a Langfuse trace with two child observations: ejentum.call (manual @observe-decorated REST wrapper with mode and scaffold length on metadata) and openai.chat.completions.create (from the langfuse.openai drop-in). The pattern generalises to any third-party REST tool an agent calls in-loop; Ejentum is the worked example because the agent calls a single endpoint with a mode arg. --- ...ntegration_ejentum_cognitive_harness.ipynb | 210 ++++++++++++++++++ 1 file changed, 210 insertions(+) create mode 100644 cookbook/integration_ejentum_cognitive_harness.ipynb diff --git a/cookbook/integration_ejentum_cognitive_harness.ipynb b/cookbook/integration_ejentum_cognitive_harness.ipynb new file mode 100644 index 0000000000..8f6ee99ec8 --- /dev/null +++ b/cookbook/integration_ejentum_cognitive_harness.ipynb @@ -0,0 +1,210 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Tracing an Ejentum cognitive harness agent loop in Langfuse\n", + "\n", + "This cookbook shows how to instrument an agent that calls the [Ejentum cognitive harness](https://ejentum.com) REST API between an LLM call and its response, so each turn shows up in Langfuse as a nested trace with the harness retrieval, the model call, and their token usage.\n", + "\n", + "The pattern is reusable. Any in-loop REST call to a third-party tool can be wrapped the same way; Ejentum is the worked example because the agent calls one endpoint with a `mode` argument, which keeps the cookbook short.\n", + "\n", + "## What you'll set up\n", + "\n", + "1. The Langfuse client and a project to send traces to.\n", + "2. The `from langfuse.openai import openai` drop-in so OpenAI calls are auto-instrumented.\n", + "3. An `@observe`-decorated function around each Ejentum REST call so the harness retrieval appears as a child observation with the `mode` and a scaffold length attached.\n", + "4. A two-turn agent that calls `harness_reasoning` then `harness_anti_deception` and answers a reasoning-heavy task each turn.\n", + "\n", + "Open the trace in your Langfuse project to inspect the tree per turn." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Install" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q langfuse openai requests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Configure Langfuse\n", + "\n", + "Set `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_HOST` from your Langfuse project. The host defaults to `https://cloud.langfuse.com`; self-hosted users should set this to their instance URL." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "# Replace with your project's keys, or load from a .env file.\n", + "os.environ.setdefault(\"LANGFUSE_HOST\", \"https://cloud.langfuse.com\")\n", + "assert os.environ.get(\"LANGFUSE_PUBLIC_KEY\"), \"LANGFUSE_PUBLIC_KEY is not set.\"\n", + "assert os.environ.get(\"LANGFUSE_SECRET_KEY\"), \"LANGFUSE_SECRET_KEY is not set.\"\n", + "assert os.environ.get(\"OPENAI_API_KEY\"), \"OPENAI_API_KEY is not set.\"\n", + "assert os.environ.get(\"EJENTUM_API_KEY\"), \"EJENTUM_API_KEY is not set. Get one at https://ejentum.com/dashboard\"\n", + "\n", + "from langfuse import Langfuse\n", + "langfuse = Langfuse()\n", + "print(\"Langfuse project URL:\", os.environ[\"LANGFUSE_HOST\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Auto-instrument OpenAI and wrap the Ejentum REST call\n", + "\n", + "Langfuse ships a drop-in for OpenAI (`from langfuse.openai import openai`). It captures the request, response, model, and token usage automatically. For Ejentum we add a thin `@observe`-decorated wrapper so each harness retrieval shows up as a child observation in the same trace, with the `mode` and scaffold length surfaced for filtering." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "from langfuse.decorators import langfuse_context, observe\n", + "from langfuse.openai import openai\n", + "\n", + "EJENTUM_URL = \"https://ejentum-main-ab125c3.zuplo.app/logicv1/\"\n", + "\n", + "\n", + "@observe(name=\"ejentum.call\")\n", + "def call_ejentum(query: str, mode: str = \"reasoning\") -> str:\n", + " \"\"\"Fetch a cognitive scaffold from the Ejentum REST gateway.\n", + "\n", + " Emits a Langfuse observation with input, output, and metadata\n", + " (mode, query_length, scaffold_length).\n", + " \"\"\"\n", + " r = requests.post(\n", + " EJENTUM_URL,\n", + " headers={\n", + " \"Authorization\": f\"Bearer {os.environ['EJENTUM_API_KEY']}\",\n", + " \"Content-Type\": \"application/json\",\n", + " },\n", + " json={\"query\": query, \"mode\": mode},\n", + " timeout=10,\n", + " )\n", + " r.raise_for_status()\n", + " payload = r.json()\n", + " scaffold = payload[0].get(mode, \"\") if isinstance(payload, list) else \"\"\n", + " langfuse_context.update_current_observation(\n", + " metadata={\n", + " \"mode\": mode,\n", + " \"query_length\": len(query),\n", + " \"scaffold_length\": len(scaffold),\n", + " },\n", + " )\n", + " return scaffold" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Run a two-turn agent and view the trace\n", + "\n", + "Each turn does the same shape:\n", + "\n", + "1. `call_ejentum(...)` (Langfuse observation with mode + scaffold length).\n", + "2. `openai.chat.completions.create(...)` from the auto-instrumented client.\n", + "\n", + "Both observations sit inside a single trace per turn, so you can drill from the trace summary into the harness retrieval, the system message it produced, and the model call that consumed it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "@observe(name=\"harness_augmented_turn\")\n", + "def harness_augmented_turn(task: str, mode: str = \"reasoning\") -> str:\n", + " scaffold = call_ejentum(task, mode=mode)\n", + " completion = openai.chat.completions.create(\n", + " model=\"gpt-4o-mini\",\n", + " messages=[\n", + " {\n", + " \"role\": \"system\",\n", + " \"content\": (\n", + " \"Apply the cognitive scaffold below before answering.\\n\\n\"\n", + " f\"[SCAFFOLD]\\n{scaffold}\\n[END]\"\n", + " ),\n", + " },\n", + " {\"role\": \"user\", \"content\": task},\n", + " ],\n", + " temperature=0,\n", + " )\n", + " return completion.choices[0].message.content or \"\"\n", + "\n", + "\n", + "print(harness_augmented_turn(\n", + " \"We have 50M users and want to add a NOT NULL column. \"\n", + " \"Walk through the trade-offs and recommend a migration path.\",\n", + " mode=\"reasoning\",\n", + "))\n", + "print(\"---\")\n", + "print(harness_augmented_turn(\n", + " \"A user insists their DB is the bottleneck and refuses other diagnostics. \"\n", + " \"Walk through whether the framing is sound before recommending.\",\n", + " mode=\"anti-deception\",\n", + "))\n", + "\n", + "# Flush so the traces show up immediately rather than on process exit.\n", + "langfuse.flush()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## What to look at in Langfuse\n", + "\n", + "Open your Langfuse project. For each turn you should see one trace with two child observations:\n", + "\n", + "- `ejentum.call`: the harness retrieval. Open the metadata to confirm `mode` and `scaffold_length`.\n", + "- `openai.chat.completions.create` (from `langfuse.openai`): the model call. Token usage, model name, prompt, and response are captured automatically.\n", + "\n", + "Useful filters in the Traces view:\n", + "\n", + "- `metadata.mode = \"anti-deception\"` to see only deception-resistance turns.\n", + "- `metadata.scaffold_length > 0` to confirm every harness call returned a non-empty scaffold.\n", + "\n", + "## Adapting the pattern\n", + "\n", + "Any in-loop REST call to a third-party tool can be instrumented the same way. The four steps in this cookbook (drop-in OpenAI client + `@observe`-decorated REST wrapper + metadata on the observation + scaffold-aware system message) generalise to harness calls in any agent loop. For larger agent frameworks, swap step 4 for the framework's native control flow; the observation shape stays the same." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 38768f012a740d4eca5f4d042a6c0e8764758ca7 Mon Sep 17 00:00:00 2001 From: "ejentum.com" Date: Sat, 23 May 2026 21:35:23 +0300 Subject: [PATCH 2/2] fix(cookbook): migrate to langfuse v3 API (get_client + update_current_span) --- ...ntegration_ejentum_cognitive_harness.ipynb | 32 ++++++++++--------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/cookbook/integration_ejentum_cognitive_harness.ipynb b/cookbook/integration_ejentum_cognitive_harness.ipynb index 8f6ee99ec8..8d7a5189d7 100644 --- a/cookbook/integration_ejentum_cognitive_harness.ipynb +++ b/cookbook/integration_ejentum_cognitive_harness.ipynb @@ -12,9 +12,9 @@ "\n", "## What you'll set up\n", "\n", - "1. The Langfuse client and a project to send traces to.\n", + "1. The Langfuse v3 client and a project to send traces to.\n", "2. The `from langfuse.openai import openai` drop-in so OpenAI calls are auto-instrumented.\n", - "3. An `@observe`-decorated function around each Ejentum REST call so the harness retrieval appears as a child observation with the `mode` and a scaffold length attached.\n", + "3. An `@observe`-decorated function around each Ejentum REST call so the harness retrieval appears as a child span, with the `mode` and a scaffold length attached via `update_current_span(metadata=...)`.\n", "4. A two-turn agent that calls `harness_reasoning` then `harness_anti_deception` and answers a reasoning-heavy task each turn.\n", "\n", "Open the trace in your Langfuse project to inspect the tree per turn." @@ -33,7 +33,7 @@ "metadata": {}, "outputs": [], "source": [ - "%pip install -q langfuse openai requests" + "%pip install -q \"langfuse>=3.0\" openai requests" ] }, { @@ -60,9 +60,11 @@ "assert os.environ.get(\"OPENAI_API_KEY\"), \"OPENAI_API_KEY is not set.\"\n", "assert os.environ.get(\"EJENTUM_API_KEY\"), \"EJENTUM_API_KEY is not set. Get one at https://ejentum.com/dashboard\"\n", "\n", - "from langfuse import Langfuse\n", - "langfuse = Langfuse()\n", - "print(\"Langfuse project URL:\", os.environ[\"LANGFUSE_HOST\"])" + "from langfuse import get_client\n", + "\n", + "langfuse = get_client()\n", + "assert langfuse.auth_check(), \"Langfuse auth check failed; verify your keys.\"\n", + "print(\"Langfuse client authenticated. Project:\", os.environ[\"LANGFUSE_HOST\"])" ] }, { @@ -71,7 +73,7 @@ "source": [ "## 3. Auto-instrument OpenAI and wrap the Ejentum REST call\n", "\n", - "Langfuse ships a drop-in for OpenAI (`from langfuse.openai import openai`). It captures the request, response, model, and token usage automatically. For Ejentum we add a thin `@observe`-decorated wrapper so each harness retrieval shows up as a child observation in the same trace, with the `mode` and scaffold length surfaced for filtering." + "Langfuse ships a drop-in for OpenAI (`from langfuse.openai import openai`). It captures the request, response, model, and token usage automatically. For Ejentum we add a thin `@observe`-decorated wrapper so each harness retrieval shows up as a child span in the same trace, with the `mode` and scaffold length surfaced via `langfuse.update_current_span(metadata=...)`." ] }, { @@ -81,7 +83,7 @@ "outputs": [], "source": [ "import requests\n", - "from langfuse.decorators import langfuse_context, observe\n", + "from langfuse import observe\n", "from langfuse.openai import openai\n", "\n", "EJENTUM_URL = \"https://ejentum-main-ab125c3.zuplo.app/logicv1/\"\n", @@ -91,7 +93,7 @@ "def call_ejentum(query: str, mode: str = \"reasoning\") -> str:\n", " \"\"\"Fetch a cognitive scaffold from the Ejentum REST gateway.\n", "\n", - " Emits a Langfuse observation with input, output, and metadata\n", + " Emits a Langfuse span with input, output, and metadata\n", " (mode, query_length, scaffold_length).\n", " \"\"\"\n", " r = requests.post(\n", @@ -106,7 +108,7 @@ " r.raise_for_status()\n", " payload = r.json()\n", " scaffold = payload[0].get(mode, \"\") if isinstance(payload, list) else \"\"\n", - " langfuse_context.update_current_observation(\n", + " langfuse.update_current_span(\n", " metadata={\n", " \"mode\": mode,\n", " \"query_length\": len(query),\n", @@ -124,10 +126,10 @@ "\n", "Each turn does the same shape:\n", "\n", - "1. `call_ejentum(...)` (Langfuse observation with mode + scaffold length).\n", + "1. `call_ejentum(...)` (Langfuse span with mode + scaffold length).\n", "2. `openai.chat.completions.create(...)` from the auto-instrumented client.\n", "\n", - "Both observations sit inside a single trace per turn, so you can drill from the trace summary into the harness retrieval, the system message it produced, and the model call that consumed it." + "Both spans sit inside a single trace per turn, so you can drill from the trace summary into the harness retrieval, the system message it produced, and the model call that consumed it." ] }, { @@ -178,9 +180,9 @@ "source": [ "## What to look at in Langfuse\n", "\n", - "Open your Langfuse project. For each turn you should see one trace with two child observations:\n", + "Open your Langfuse project. For each turn you should see one trace with two child spans:\n", "\n", - "- `ejentum.call`: the harness retrieval. Open the metadata to confirm `mode` and `scaffold_length`.\n", + "- `ejentum.call`: the harness retrieval. Open the metadata panel to confirm `mode` and `scaffold_length`.\n", "- `openai.chat.completions.create` (from `langfuse.openai`): the model call. Token usage, model name, prompt, and response are captured automatically.\n", "\n", "Useful filters in the Traces view:\n", @@ -190,7 +192,7 @@ "\n", "## Adapting the pattern\n", "\n", - "Any in-loop REST call to a third-party tool can be instrumented the same way. The four steps in this cookbook (drop-in OpenAI client + `@observe`-decorated REST wrapper + metadata on the observation + scaffold-aware system message) generalise to harness calls in any agent loop. For larger agent frameworks, swap step 4 for the framework's native control flow; the observation shape stays the same." + "Any in-loop REST call to a third-party tool can be instrumented the same way. The four steps in this cookbook (drop-in OpenAI client + `@observe`-decorated REST wrapper + metadata via `update_current_span` + scaffold-aware system message) generalise to harness calls in any agent loop. For larger agent frameworks, swap step 4 for the framework's native control flow; the span shape stays the same." ] } ],