Intencion captures every agent run end to end: the intent behind it, the tool calls it made, and how it ended. You add it in one line and keep your agent exactly as it is. This page is the honest version of how it works, including what is not built yet.
Published for TypeScript and Python. No agent framework required, and no framework is off limits.
npm i @intencion/sdk   # TypeScript / Node
pip install intencion  # Python
The fast path instruments your model client once and captures every call automatically. The explicit path lets you name the intent and record your own tool steps. They compose: instrument once, and wrap the runs where you want tool-level detail.
One line patches the provider SDK at the class (prototype) level, so it also covers clients that frameworks like LangChain, the OpenAI Agents SDK, and LlamaIndex build internally.
import OpenAI from "openai";
import { Intencion } from "@intencion/sdk";
const ix = new Intencion({ apiKey: process.env.INTENCION_API_KEY });
const openai = ix.instrumentOpenAI(new OpenAI());
// Just use the client. Every call is captured: model, tokens,
// latency, outcome. The intent is inferred server-side.
await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: userMessage }],
});
// ix.instrumentAnthropic(new Anthropic()) works the same way.

Wrap a run and call run.step() for each tool. Model calls inside are still auto-captured, so you only annotate your own functions. No decorators to sprinkle everywhere.
await ix.run({ intent: "refund_request", input, user }, async (run) => {
  const order = await lookupOrder(id);
  run.step({ name: "lookup_order", tool: "orders-db", status: "success", ms: 24 });
  const refund = await issueRefund(order);
  run.step({ name: "issue_refund", tool: "payments", status: "success", ms: 120 });
  return refund; // returns cleanly => success; throw => failure
});
// In a serverless function, await ix.flush() before returning.

For a chat agent, wrap the conversation in ix.session() and each turn in ix.run(). The agent tool loop folds into steps under that one run, so a multi-turn chat becomes one run per turn, grouped by session and in order, visible at /app/sessions. Without the per-turn run, each model call becomes its own run and the conversation scatters.
// One conversation = one session. Each user turn = one run.
async function handleTurn(conversationId, userId, message) {
  return ix.session({ session: conversationId, user: userId }, () =>
    ix.run({ intent: "auto", input: message }, async (run) => {
      // your agent's tool loop: each model call folds in as a step,
      // each tool call is a run.step(...). Many calls -> ONE run per turn.
      return await runAgent(message);
    })
  );
}
// Call it for every message with the SAME conversationId.

No model guesses whether your agent succeeded. A run that returns is a success. A run that throws is a failure, and the error message becomes the reason. That is the default, with zero extra code.
Call run.ok(), run.fail(reason), or run.abandon() inside the run to set the outcome yourself. Abandoned means the user or agent gave up, which is different from an error. Alternatively, map a business result declaratively with a per-run success() predicate or a global classifyOutcome resolver, so the outcome is set in one place.
Most agents catch a tool error and reply politely, so the function still returns. Wrap a tool in run.tool(name, fn) and the step is marked errored when the tool throws. A run that returns with an errored step is recorded as degraded, not success, so a caught failure is never counted as a win. With debug on, the SDK warns you when the success rate is 100%, because that can mean outcomes are not being classified at all.
Because outcomes come from how the run actually ended, there is no classifier sitting in the middle and no accuracy rate to babysit. Replay the same run and you get the same outcome every time.
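The default rules above are small enough to model in a few lines. What follows is a standalone sketch of the described behavior, not the SDK's internals: resolveOutcome, StepRecord, and RunResult are hypothetical names, and the sketch is synchronous for brevity. It leaves out the explicit run.ok(), run.fail(), and run.abandon() calls.

```typescript
type Outcome = "success" | "failure" | "degraded";

// Hypothetical shape of a recorded tool step.
interface StepRecord {
  name: string;
  status: "success" | "error";
}

interface RunResult {
  outcome: Outcome;
  reason?: string;
}

// Models the defaults described in the text:
//   throw                         => failure, error message becomes the reason
//   return with an errored step   => degraded
//   return with no errored steps  => success
function resolveOutcome(fn: (steps: StepRecord[]) => unknown): RunResult {
  const steps: StepRecord[] = [];
  try {
    fn(steps);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { outcome: "failure", reason };
  }
  const hasErroredStep = steps.some((s) => s.status === "error");
  return { outcome: hasErroredStep ? "degraded" : "success" };
}
```

Because the result depends only on how the function ended and which steps it recorded, calling resolveOutcome twice with the same behavior yields the same outcome, which is the determinism the paragraph above describes.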
Pass intent to ix.run() and that label is used as is. If you do not, an auto-instrumented call sends no intent and we infer one server-side, after the response, so nothing is added to your request path.
We embed the input and match it to your existing intent clusters by cosine similarity. Only when nothing is close enough do we ask a small model (Claude Haiku) to name a new intent in lower_snake_case, and it is told to reuse your existing labels first. Your taxonomy stays stable instead of sprouting near-duplicates.
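As a rough illustration of the matching step, the sketch below compares an input embedding against cluster centroids and returns a label only when the best score clears a threshold. Everything here is an assumption made for illustration: the IntentCluster shape, the matchIntent name, and the 0.8 threshold are hypothetical, and the real embedding model and cutoff are not stated on this page.

```typescript
// Hypothetical representation of an existing intent cluster.
interface IntentCluster {
  label: string;
  centroid: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vector length mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the closest existing label, or null when nothing is close enough.
// A null result is the case where a new intent would have to be named.
function matchIntent(
  embedding: number[],
  clusters: IntentCluster[],
  threshold = 0.8, // assumed cutoff, for illustration only
): string | null {
  let bestLabel: string | null = null;
  let bestScore = -Infinity;
  for (const cluster of clusters) {
    const score = cosineSimilarity(embedding, cluster.centroid);
    if (score > bestScore) {
      bestScore = score;
      bestLabel = cluster.label;
    }
  }
  return bestScore >= threshold ? bestLabel : null;
}
```

Preferring an existing label whenever one clears the threshold is what keeps the taxonomy from growing near-duplicates: a new name is only needed in the null case.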
You control labels today by declaring intents. Editing, merging, and renaming the taxonomy from the dashboard is on the roadmap. The Emerging view is a newly-seen-label heuristic, not offline ML clustering yet.
Capturing a call adds about a millisecond. Runs are queued in memory and flushed in the background, so your model response returns to the user before anything is sent to us.
The queue flushes every 5 seconds, or sooner at 100 runs or roughly 500 KB. Under backpressure it is bounded and drops the oldest runs rather than blocking your agent. On serverless it flushes via the platform after() hook, and you can await ix.flush() before a function returns.
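The queue behavior described above can be sketched as a bounded buffer with two size-based flush triggers. This is a minimal model, not the SDK's code: BoundedRunQueue and QueuedRun are hypothetical names, the hard capacity of 1,000 runs is an assumed value (the page only says the queue is bounded), and payload size is approximated by string length rather than encoded bytes. The 5-second timer is left to the caller.

```typescript
// Hypothetical shape of a queued run.
interface QueuedRun {
  id: string;
  payload: string; // serialized run data
}

class BoundedRunQueue {
  private items: QueuedRun[] = [];
  private approxBytes = 0;
  private readonly flushRuns: number;
  private readonly flushBytes: number;
  private readonly capacity: number;
  dropped = 0;

  constructor(flushRuns = 100, flushBytes = 500_000, capacity = 1_000) {
    this.flushRuns = flushRuns; // "at 100 runs"
    this.flushBytes = flushBytes; // "roughly 500 KB"
    this.capacity = capacity; // assumed hard bound
  }

  // Never blocks the caller: when the queue is full, the oldest run is dropped.
  enqueue(run: QueuedRun): void {
    this.items.push(run);
    this.approxBytes += run.payload.length;
    while (this.items.length > this.capacity) {
      const oldest = this.items.shift();
      if (oldest) this.approxBytes -= oldest.payload.length;
      this.dropped++;
    }
  }

  // True once either size threshold is reached.
  shouldFlush(): boolean {
    return this.items.length >= this.flushRuns || this.approxBytes >= this.flushBytes;
  }

  // Hands the current batch to the caller and resets the queue.
  drain(): QueuedRun[] {
    const batch = this.items;
    this.items = [];
    this.approxBytes = 0;
    return batch;
  }

  get size(): number {
    return this.items.length;
  }
}
```

A caller would check shouldFlush() after each enqueue and also call drain() on a 5-second timer. Dropping the oldest run under backpressure trades completeness for a guarantee that instrumentation never stalls the agent.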
Personal details are redacted before anything is stored. Emails, credit-card numbers (Luhn checked), US Social Security numbers, and phone numbers are replaced with markers like <EMAIL> and <CC>. Redaction runs in your process before send, and again on our server before persist, so it is enforced twice. Full model responses are never proxied.
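To make the redaction pass concrete, here is a minimal sketch of pattern-based redaction with a Luhn check on card candidates. It is an illustration under assumptions, not the shipped rules: the regular expressions are simplified, redact and passesLuhn are hypothetical names, and only the <EMAIL> and <CC> markers come from this page. The <SSN> and <PHONE> markers are guesses at the naming.

```typescript
// Luhn checksum: from the right, double every second digit, subtract 9 from
// any result above 9, and require the total to be divisible by 10.
function passesLuhn(digits: string): boolean {
  if (digits.length === 0) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Simplified patterns, for illustration only.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const CARD_CANDIDATE = /\b(?:\d[ -]?){12,18}\d\b/g; // 13 to 19 digits
const SSN = /\b\d{3}-\d{2}-\d{4}\b/g;
const PHONE = /(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b/g;

function redact(text: string): string {
  return text
    .replace(EMAIL, "<EMAIL>")
    // Only digit runs that pass the Luhn check are treated as card numbers.
    .replace(CARD_CANDIDATE, (m) => (passesLuhn(m.replace(/\D/g, "")) ? "<CC>" : m))
    .replace(SSN, "<SSN>") // assumed marker name
    .replace(PHONE, "<PHONE>"); // assumed marker name
}
```

The Luhn check is what keeps arbitrary long numbers, such as order IDs, from being mislabeled as cards. Running the same pass in the client and again on the server means a misconfigured or outdated client cannot bypass it.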
Intencion is the product layer, not a tracing backend. It coexists with your LLM observability (Langfuse, LangSmith, Datadog) and your product analytics (Amplitude, PostHog). Those describe spans, tokens, and events. Intencion answers what users wanted and whether they got it. We do not ask you to adopt a span format or re-emit your traces. An OpenTelemetry bridge for teams that already emit spans is on the roadmap; until then the SDK is the integration.
So you know exactly where the edges are today.
Programmatic export and webhooks
There is no JSON or CSV export endpoint or webhook stream yet. We will help you offboard your data in the meantime.
OpenTelemetry bridge
We patch the provider SDK directly today. An OTel bridge for teams with an existing collector is planned.
Taxonomy editing
Renaming and merging intents from the dashboard is planned. For now, declare intents to control labels precisely.
Offline clustering
Emerging intents use a newly-seen-label heuristic. A BERTopic/Clio-style offline cluster pass is planned.
Self-serve billing
The Free tier self-serves now. Buying a paid plan is demo-led while billing is wired up.
Response capture
We capture intent, steps, tokens, latency, and outcome. Full assistant output capture is opt-in and rolling out.
Start free in a minute, or click through the live demo first.