Changelog - Trodo

May 2026

May 11, 2026 — UX Signals Dashboard, SDK 2.4.2, and Chat Tool Expansion

UX Signals Dashboard

A dedicated UX Signals dashboard is now available from the main navigation, consolidating the four UX telemetry surfaces the auto-events SDK has been collecting since March into a single coherent view.

Click heatmap — A canvas overlay renders aggregated click coordinates across any page path. Click density is expressed as a heat gradient from cool blue to red. The heatmap can be scoped by date range, device class, and user segment. Selecting any region opens a filtered event list showing the raw clicks within that area.
Scroll depth chart — A horizontal bar chart shows the percentage of sessions that reached each 10% depth milestone for a given page. A dashed line marks the fold threshold inferred from the median viewport height reported in page_performance events.
Rage click analysis — A ranked table of elements most frequently rage-clicked, keyed by the element_selector property on rage_click events. Each row shows the element path, unique user count, share of sessions, and a daily frequency sparkline. Clicking a row opens the session list filtered to sessions containing at least one rage click on that element.
Form abandonment funnel — form_start, form_submit, and form_abandon events are assembled into a mini-funnel per form identifier. Abandonment rate and median time-to-abandon are displayed alongside the most common field at which users stopped interacting, derived from the field_name property on field_blur events.

SDK 2.4.2: ESM fix, OTel v2, span ID correction, debug flag

Four issues discovered in production after the 2.4.0 release are resolved in this patch.

ESM compatibility — Projects using "type": "module" in package.json received require is not defined because the internal loader for the optional @opentelemetry/sdk-node peer dependency used synchronous require(). SDK 2.4.2 replaces it with a dynamic import() guarded by a try/catch, valid in both ESM and CJS environments.
OTel v2 Resource constructor — OpenTelemetry API v2 changed the Resource constructor signature. The registerOtel helper now detects the OTel API version at runtime via the VERSION export and branches between the v1 and v2 constructor forms, eliminating deprecation warnings and TypeErrors on @opentelemetry/api@^2.0.
Span ID format — A refactor in SDK 2.3.0 inadvertently changed span IDs from 16-character lowercase hex (the OTel wire format) to hyphenated UUIDs, breaking correlation workflows that joined on span_id. SDK 2.4.2 restores the 16-character hex format. Existing rows are not backfilled.
Debug flag — A new debug: true option on init() and registerOtel() prints each exported span (name, trace ID, span ID, attributes, duration) and the OTLP endpoint URL to the console. Defaults to false and is not read from environment variables to avoid accidental enablement.
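As an illustration of the span ID format restored in this patch, the following is a minimal Python sketch, not the SDK's actual code. The helper names new_span_id and is_hex_span_id are hypothetical; the only facts taken from the note above are the 16-character lowercase hex shape and the hyphenated UUID form it replaces.

```python
import re
import secrets

# 16 lowercase hex characters = 8 random bytes, the span ID shape named above.
SPAN_ID_PATTERN = re.compile(r"[0-9a-f]{16}")


def new_span_id() -> str:
    """Return a random 16-character lowercase hex span ID (hypothetical helper)."""
    span_id = secrets.token_hex(8)
    # An all-zero span ID is invalid in OpenTelemetry, so retry in that rare case.
    while span_id == "0" * 16:
        span_id = secrets.token_hex(8)
    return span_id


def is_hex_span_id(value: str) -> bool:
    """True for 16-char lowercase hex; False for hyphenated UUIDs and anything else."""
    return SPAN_ID_PATTERN.fullmatch(value) is not None
```

A check like is_hex_span_id can help when joining on span_id across rows written before and after the fix, since rows written in the UUID format are not backfilled.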

Chat: issues, heal, and cluster tools

Three new tool groups extend the Chat interface to cover the remaining observability surfaces.

Issues tools — List open issues, fetch detail for a specific issue (affected runs, severity history, timeline, example spans), filter by detector type or severity, and update status (acknowledged, resolved, muted).
Heal tools — Retrieve the root-cause narrative and suggested fix for any issue in the Heal queue. Trigger a manual heal analysis on a specified issue ID and receive results inline.
Cluster tools — Enumerate clusters for a given agent, fetch topic labels and member counts, list runs assigned to a cluster, and retrieve the 2D canvas coordinates. Enables queries such as “which cluster has the most failed runs?” or “list the top 10 runs in the onboarding cluster”.

May 7, 2026 — Heal Workflows and Semantic Search

Heal: Automated Root-Cause Analysis and Fix Suggestion

The Heal system is now generally available. It adds a structured remediation workflow to the Issues surface, targeting issues that have been open for more than 24 hours or have spiked in severity since their last check.

Heal queue — A “Heal” tab in the main navigation lists all issues currently queued for analysis. Items enter the queue automatically when an issue’s severity score crosses the configured threshold for the first time, or re-crosses it after a previous resolution. Operators can also manually enqueue any issue from its detail page.
Root-cause narrative — The Heal pipeline fetches the 20 most recent runs associated with the issue, extracts the relevant spans and tool call inputs/outputs, and sends a structured prompt to a language model asking it to identify the common failure pattern. The model returns a free-prose narrative (typically 3–5 sentences) stored in the issue detail sidebar under “Root Cause”.
Suggested fix — The same prompt asks the model to propose a concrete remediation step — a code snippet, configuration change, or process recommendation. The suggestion is displayed in a “Suggested Fix” panel. It is advisory only; no automated changes are made.
Regression guard — After an issue is marked resolved, the Heal system continues monitoring incoming runs. If the issue’s per-day event count re-crosses the severity threshold within 14 days, the issue is automatically re-opened and re-enqueued with a “Regression” label. A Slack notification is sent if a webhook is configured in team settings.

Semantic Search

A new Cmd+K (Ctrl+K on Windows/Linux) shortcut opens the global search modal from any page in the dashboard. Semantic search accepts free-form natural language queries and returns results from three index types simultaneously: agent runs, individual spans, and analytics events.

How it works — The search service computes a 1536-dimension embedding of the query text using text-embedding-3-small. A cosine similarity scan over the semantic_search_documents table returns the top candidates across all three document types, ranked by a linear combination of embedding similarity and a recency score that decays over 30 days.
LLM refinement — For queries that produce low average similarity scores (below a confidence threshold defaulting to 0.72), the service sends the query and top-10 candidates to a language model for re-ranking. The model selects the candidates that genuinely answer the query and generates a one-sentence explanation for each. Refined results display the LLM annotation below each result card.
Confidence gating — If after LLM refinement fewer than two results exceed the threshold, a disambiguation panel replaces the result list with up to three suggested reformulations and a free-text field for manual refinement.
Background indexing — New agent runs are indexed within 60 seconds of completion by a BullMQ worker job that concatenates the agent name, input, output, all tool names, and all span names, computes the embedding, and inserts the vector into semantic_search_documents. The title and snippet fields carry a trigram GIN index to support hybrid queries that mix keyword prefix matching with embedding similarity.
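The ranking rule described under Semantic Search (a linear combination of embedding similarity and a 30-day recency score, with LLM refinement below a 0.72 average similarity) can be sketched roughly as follows. This is an illustrative Python sketch only: the 0.8/0.2 weights, the linear shape of the decay, and the Candidate type are assumptions, since the changelog does not document them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

RECENCY_WINDOW_DAYS = 30      # from the changelog: recency decays over 30 days
CONFIDENCE_THRESHOLD = 0.72   # from the changelog: default refinement threshold
SIMILARITY_WEIGHT = 0.8       # assumption: the real weights are not documented
RECENCY_WEIGHT = 0.2          # assumption


@dataclass
class Candidate:
    doc_id: str
    similarity: float         # cosine similarity to the query embedding
    created_at: datetime


def recency_score(created_at, now):
    """Assumed linear decay from 1.0 (just indexed) to 0.0 (30 or more days old)."""
    age_days = (now - created_at).total_seconds() / 86400
    return max(0.0, 1.0 - age_days / RECENCY_WINDOW_DAYS)


def rank(candidates, now):
    """Sort candidates by the weighted sum of similarity and recency, best first."""
    def combined(c):
        return (SIMILARITY_WEIGHT * c.similarity
                + RECENCY_WEIGHT * recency_score(c.created_at, now))
    return sorted(candidates, key=combined, reverse=True)


def needs_llm_refinement(candidates):
    """True when the average similarity falls below the confidence threshold."""
    if not candidates:
        return False
    average = sum(c.similarity for c in candidates) / len(candidates)
    return average < CONFIDENCE_THRESHOLD


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    docs = [Candidate("run-1", 0.80, now - timedelta(days=30)),
            Candidate("span-7", 0.75, now)]
    print([c.doc_id for c in rank(docs, now)])
```

With these assumed weights, a slightly less similar but fresh document can outrank an older one, which is the practical effect of mixing recency into the score.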

May 4, 2026 — Vercel AI SDK Integration and Additional Provider Support

Vercel AI SDK: Zero-Code Instrumentation

Teams using the Vercel AI SDK can now send agent run telemetry to Trodo without importing the Trodo SDK at all. Set two environment variables before the process starts:

TRODO_API_KEY=your_write_key
TRODO_AGENT_NAME=your_agent_name

Trodo’s OTLP ingest endpoint accepts the telemetry the Vercel AI SDK emits when experimental_telemetry: { isEnabled: true } is set on generateText, streamText, generateObject, or streamObject calls. The ingest layer maps Vercel AI SDK span attributes to the Trodo span schema: ai.model.id to model name, ai.usage.promptTokens and ai.usage.completionTokens to token counts, ai.toolCall.name and ai.toolCall.args to tool name and input, and ai.response.text to LLM output. Trodo infers a run boundary from the root span of each call tree, grouping multi-step useChat pipelines under a single agent run. A Next.js App Router quickstart is available in the developer docs under Integrations.

Additional provider auto-instrumentation

SDK 2.4.0 ships auto-instrumentation modules for five providers in addition to the Anthropic and OpenAI modules released in March.

Amazon Bedrock — InstrumentBedrock / instrument_bedrock wraps @aws-sdk/client-bedrock-runtime and boto3. Intercepts InvokeModel and InvokeModelWithResponseStream calls and records model ID, token counts, latency, and finish reason. Model IDs are mapped to the Trodo pricing table for cost estimates.
Google Generative AI — InstrumentGoogle / instrument_google wraps @google/generative-ai and google-generativeai. Token counts are read from usageMetadata on the response. Streaming calls produce a single span covering the full stream duration.
Cohere — InstrumentCohere / instrument_cohere wraps cohere-ai and cohere. Both generate and chat endpoints are instrumented. The command ID and finish reason are recorded as span attributes.
Mistral — InstrumentMistral / instrument_mistral wraps @mistralai/mistral-node-client and mistralai. Function-call responses record the function name and arguments as span attributes.
HTTP fetch — InstrumentFetch / instrument_fetch intercepts fetch (Node 18+) and requests (Python) for providers not covered by a named module. When the target host matches a configurable allowlist of LLM API hostnames, the interceptor parses the JSON request body to extract model name and messages, and the response body to extract token counts.

All modules are opt-in and compatible with the registerOtel OTel bridge from SDK 2.4.0. Spans from auto-instrumented calls carry trodo.auto_instrumented = true in OTLP exports.
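To make the HTTP fetch interceptor's allowlist check concrete, here is a rough Python sketch of the decision it describes: match the target host, then parse the JSON request body for the model name and messages. The function name extract_llm_request, the two example hostnames, and the message_count attribute are assumptions for illustration; only the allowlist idea and the trodo.auto_instrumented attribute come from the notes above.

```python
import json
from urllib.parse import urlparse

# Assumption: example hostnames only; the real allowlist is configurable
# and its contents are not listed in the changelog.
LLM_HOST_ALLOWLIST = {"api.openai.com", "api.anthropic.com"}


def extract_llm_request(url, body, allowlist=LLM_HOST_ALLOWLIST):
    """Return span attributes for an allowlisted LLM request, or None to skip it."""
    host = urlparse(url).hostname
    if host not in allowlist:
        return None  # not an LLM API host: do not instrument
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None  # body is missing or not JSON: leave the request untouched
    if not isinstance(payload, dict):
        return None
    return {
        "model": payload.get("model"),
        "message_count": len(payload.get("messages") or []),
        "trodo.auto_instrumented": True,
    }
```

Returning None for anything unexpected keeps the interceptor from interfering with ordinary HTTP traffic, which seems the safer default for a catch-all module.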

May 2, 2026 — OTLP Ingest and SDK 2.4.0 OpenTelemetry Bridge

OTLP Ingest Endpoint

A new OTLP-compatible HTTP ingest endpoint is available at /v1/traces. It accepts trace payloads in both application/x-protobuf (the standard OTLP wire format) and application/json (the OTLP JSON encoding). Any OpenTelemetry SDK or auto-instrumentation library that supports HTTP OTLP export can send traces to Trodo with no Trodo-specific SDK required.

Requests must include Authorization: Bearer <write_key> or the write key in an X-Trodo-Api-Key header. The ingest layer groups all spans sharing a trace ID into a candidate run: the root span (no parent ID) becomes the run record; child spans are stored as agent spans with parent relationships preserved. If a trodo.agent_name resource attribute is present it is used as the agent name, otherwise the OTel service name is used. Span attributes not recognized by the Trodo schema are stored in a metadata JSONB column and accessible in the span detail view under “Raw Attributes”. The endpoint is rate-limited at 1,000 requests per minute per write key; payloads exceeding 5 MB are rejected with 413 Content Too Large.

SDK 2.4.0: registerOtel

SDK 2.4.0 ships registerOtel (Node) / register_otel (Python), a new top-level export that bridges Trodo agent run semantics with the OpenTelemetry ecosystem.

registerOtel accepts a mode parameter with three values:

"trodo" (the default when no OTel SDK is detected) records spans via the existing wrapAgent / withSpan surface and exports directly to the Trodo ingest API — identical to pre-2.4.0 behavior.
"otlp" configures a minimal OTel SDK using @opentelemetry/sdk-node (Node) or opentelemetry-sdk (Python) as a peer dependency and registers the Trodo OTLP endpoint as the exporter.
"coexist" attaches Trodo’s OTLP exporter to an existing OTel SDK already configured by the host application, so that agent-relevant spans are received by Trodo without displacing the existing exporter.

When called without a mode argument the SDK auto-detects: if OTEL_EXPORTER_OTLP_ENDPOINT is set it defaults to "coexist"; if @opentelemetry/sdk-node is importable as a peer dependency it defaults to "otlp"; otherwise it defaults to "trodo". Regardless of mode, the SDK maps semantic conventions used by OpenAI, Anthropic, LangChain, and Vercel AI SDK telemetry to the Trodo span schema, so spans from third-party instrumentation libraries appear correctly in the run explorer without manual attribute remapping.

Upgrade via npm install @trodo/sdk@2.4.0 or pip install trodo-sdk==2.4.0.

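The mode auto-detection order described for registerOtel / register_otel in the May 2 entry can be sketched in Python as below. This is an illustration of the documented precedence, not the SDK's implementation: detect_mode and otel_sdk_importable are hypothetical names, and treating an empty environment variable as unset is an assumption.

```python
import importlib.util
import os

OTLP_ENDPOINT_VAR = "OTEL_EXPORTER_OTLP_ENDPOINT"


def otel_sdk_importable():
    """Check whether the opentelemetry-sdk package can be imported."""
    try:
        return importlib.util.find_spec("opentelemetry.sdk") is not None
    except ModuleNotFoundError:
        # Raised when the parent "opentelemetry" package itself is missing.
        return False


def detect_mode(env=None, sdk_available=None):
    """Resolve the mode in the order the release note gives."""
    env = os.environ if env is None else env
    # Assumption: an empty value counts as "not set".
    if env.get(OTLP_ENDPOINT_VAR):
        return "coexist"
    if sdk_available is None:
        sdk_available = otel_sdk_importable()
    return "otlp" if sdk_available else "trodo"
```

Passing env and sdk_available explicitly, as in detect_mode({}, sdk_available=False), makes the precedence easy to exercise without changing the real environment.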
April 2026

April 27, 2026 — MCP Runless Spans, SDK 2.3.1, and Chat Agent Tools

MCP: Runless Spans Architecture

The MCP server integration has been re-architected to remove the requirement that MCP tool calls belong to an agent run. Previously, a synthetic run was created per Claude conversation session and all tool call spans were attached to it. MCP tool calls now create a single span in agent_spans with no parent run record required, recording the tool name, input, output, duration, and the conversation_id extracted from the MCP request context. A user_id is materialized if the requesting Claude session is linked to a known Trodo user identity.

Per-span billing replaces per-run billing for MCP. Each span carries a cost computed from its duration and a per-millisecond rate. The Token and Cost Analytics page now has an “MCP Tools” tab showing cost by tool name, by day, and by user. A Redis-backed session sweeper that previously detected conversation end to close synthetic runs has been removed, reducing infrastructure dependencies for self-hosted deployments. A migration adds a conversation_id column to agent_spans with a partial index on (conversation_id, tool_name); existing spans have conversation_id = NULL.

SDK 2.3.1: trackMcp / track_mcp

SDK 2.3.1 introduces a dedicated primitive for recording MCP tool calls. trackMcp(options) (Node) and track_mcp(options) (Python) accept toolName, input, output, durationMs, userId, conversationId, and an optional metadata object. The call creates a single span of kind mcp_tool associated with the current active run if one exists, or as a standalone runless span if called outside a wrapAgent context. Spans are added to the same outbound batch queue used by withSpan and flushed every 2 seconds or when the batch reaches 100 items, preventing MCP-heavy workloads from flooding the ingest API with one HTTP request per tool call.

Upgrade via npm install @trodo/sdk@2.3.1 or pip install trodo-sdk==2.3.1.

Chat: eval tools, agent run tools, and cross-domain query chaining

The Chat interface gains access to the full evaluations and agent run query surfaces. Eval tools allow querying evaluator configurations, retrieving results for a specific agent or date range, listing pending human evaluation items, submitting grades, and skipping queue items. Agent run tools cover listing runs with filter and sort, fetching full run detail including all spans and attributes, computing token cost breakdowns, and retrieving tool call analysis reports.

Cross-domain query chaining lets a single question combine results from both domains. For example, “compare the retention rate of users who had failed runs last week vs. those who didn’t” triggers list_agent_runs (failed, last week), get_users_for_event to resolve user IDs, then run_retention_query scoped to those users. The final response presents all three results together. For queries where interpreting an intermediate result requires human judgment before the next tool call, Chat presents a confirmation step that the user can accept or redirect.
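For the SDK 2.3.1 batching rule described earlier in this entry (flush every 2 seconds or when the batch reaches 100 items), a simplified Python sketch follows. SpanBatcher and its send callback are hypothetical names, and the sketch checks the interval only when spans are added or maybe_flush is called; a real SDK would presumably also run a background timer.

```python
import time


class SpanBatcher:
    """Buffers spans and flushes when the batch is full or the interval elapses."""

    def __init__(self, send, max_items=100, interval_s=2.0, clock=time.monotonic):
        self._send = send              # callable that ships one list of spans
        self._max_items = max_items    # from the changelog: 100 items
        self._interval_s = interval_s  # from the changelog: 2 seconds
        self._clock = clock
        self._buffer = []
        self._last_flush = clock()

    def add(self, span):
        """Queue one span, flushing if the size or age limit is reached."""
        self._buffer.append(span)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush on size or age; does nothing while the buffer is empty."""
        full = len(self._buffer) >= self._max_items
        stale = self._clock() - self._last_flush >= self._interval_s
        if self._buffer and (full or stale):
            self.flush()

    def flush(self):
        """Send everything buffered as a single batch and reset the timer."""
        batch, self._buffer = self._buffer, []
        self._last_flush = self._clock()
        if batch:
            self._send(batch)
```

The point of the design is that 100 tool calls produce one outbound request rather than 100, which is the flooding problem the release note mentions.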

April 24, 2026 — MCP Server: Tool Catalog, Auth, and Quickstarts

MCP Server

Trodo is now available as an MCP (Model Context Protocol) server, exposing the full analytics and agent observability surface to any MCP-capable LLM client.

Tool catalog — The MCP server exposes over 70 tools across eight categories:

Analytics: run_insights_query, run_funnel_query, run_retention_query, run_flow_query, get_property_distribution, list_event_names, list_event_properties, get_users_for_event
Agent Runs: list_agent_runs, get_agent_run, search_agent_runs, get_run_metrics, get_token_cost_breakdown, get_tool_call_analysis, get_agent_feedback_summary
Evaluations: list_evaluators, get_evaluator, get_eval_results, list_pending_human_evals, submit_human_eval_grade, skip_human_eval
Issues: list_issues, get_issue_details, get_issue_members, get_issue_timeline, set_issue_status, get_top_failure_modes, get_top_failing_tools
Clusters: list_use_case_clusters, get_cluster_summary, get_cluster_runs
Users: get_user_profile, get_user_journey, get_user_agent_runs, find_users, get_top_users
Heal: report_heal_branch, get_anomaly_detection
Evaluator Management: backfill_evaluator, toggle_evaluator, test_evaluator

Authentication — Teams can register an OAuth application from Settings > Integrations. The authorization server at /auth/oauth supports the authorization code flow with PKCE, issuing access tokens (default 1-hour expiry) and refresh tokens (30-day expiry). For simpler setups, a static API key prefixed with trk_ can be generated from Settings > API Keys. Tools are grouped into six permission scopes (analytics:read, analytics:write, agent:read, agent:write, eval:read, eval:write); tokens can be restricted to a subset of scopes. High-fanout tools are rate-limited at 60 calls per minute per token.

Quickstarts — The developer docs include setup guides for Claude Web (project configuration via pasted server URL), Claude Desktop (JSON config file), Claude Code (.mcp.json in the project directory), and Cursor (.cursor/mcp.json). Each quickstart covers authentication, connection verification, and three example prompts demonstrating cross-tool queries.
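For clients implementing the authorization code flow with PKCE mentioned under Authentication, the verifier and challenge pair can be generated as in the Python sketch below. It follows the S256 method from RFC 7636; whether the Trodo authorization server requires S256 specifically is an assumption, and the helper names are illustrative.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes=32):
    """Random URL-safe verifier; 32 bytes encode to 43 characters (the RFC 7636 minimum)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier):
    """S256 challenge: base64url(SHA-256(verifier)) with padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

In the standard flow, the client sends the challenge with the authorization request and the original verifier with the token request, so an intercepted authorization code cannot be redeemed without the verifier.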

April 21, 2026 — Additional Issue Detectors and Severity Scoring

Issues: output_quality, ux_rage, and tool_misuse Detectors

Three additional issue detectors ship this week, expanding the automatic monitoring coverage beyond tool failures and conversation breakdowns.

output_quality — Evaluates the final output of each agent run using a language model judge that checks completeness (did the agent address all parts of the request?), factual coherence (are statements internally consistent?), and format adherence (does the output match the expected structure?). Runs scoring below 0.5 are flagged. An issue is created when the flagged rate exceeds 15% of runs in any 24-hour window, or when the 7-day rolling average drops below 0.4. Runs where the judge’s confidence is below 0.6 are placed in the human evaluation queue automatically. The quality threshold, rubric dimensions, and judge model are editable from Settings > Issues.
ux_rage — Monitors agent runs for conversation-level frustration signals: repetition of a semantically equivalent request within the last five turns, explicit negative feedback phrases, very short replies following a long agent response (indicating dismissal), and session abandonment within 30 seconds of the agent’s last response. A composite frustration score is computed per run as a weighted sum of signal counts divided by total turns. An issue is created when five or more runs from the same agent exceed the threshold (default 0.45) within a seven-day window. The issue detail view includes a “Frustration Timeline” showing per-turn signal scores across all member runs.
tool_misuse — Identifies cases where a tool is called with structurally valid but semantically incorrect arguments: the tool executes without error, but a lightweight LLM evaluation of the arguments against the tool’s description and expected semantics assigns a misuse score above 0.65. When at least three distinct runs contain misuse candidates on the same tool, an issue is created. This detector is particularly useful for identifying systemic prompt engineering problems where the model consistently misunderstands a tool’s purpose.
Polymorphic issue members — The agent_issues_members join table now supports three member types: run, span, and event. The output_quality and tool_misuse detectors attach at the span level, giving issue detail views precise attribution rather than implicating the entire run.

Z-score severity scoring

Issue severity is now computed dynamically using a z-score against each detector’s 28-day historical event rate: z = (current - mean) / stddev. Issues with z > 2 are rated Elevated; z > 3 are Critical. Issues with fewer than 7 days of history fall back to an absolute threshold. When an issue’s severity drops below Elevated and then returns above it within 14 days, a “Regression” badge is applied to the issue and its timeline. The Heal system uses this signal as an automatic re-analysis trigger. The Issues list now sorts by current z-score descending by default.
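The z-score severity rule above can be illustrated with a short Python sketch. The thresholds (z > 2 Elevated, z > 3 Critical, fallback under 7 days of history) come from the text; the "Normal" label, the use of population standard deviation, the fallback value of 10 events, and the handling of a flat history are assumptions.

```python
from statistics import mean, pstdev

MIN_HISTORY_DAYS = 7     # from the changelog: shorter histories use an absolute threshold
ABSOLUTE_FALLBACK = 10   # assumption: the fallback value is not documented


def severity(current, history):
    """Classify the current event count against the detector's daily history."""
    if len(history) < MIN_HISTORY_DAYS:
        return "Elevated" if current >= ABSOLUTE_FALLBACK else "Normal"
    mu = mean(history)
    sigma = pstdev(history)  # assumption: population rather than sample stddev
    if sigma == 0:
        # Flat history: any increase is off the scale, no change is unremarkable.
        z = float("inf") if current > mu else 0.0
    else:
        z = (current - mu) / sigma
    if z > 3:
        return "Critical"
    if z > 2:
        return "Elevated"
    return "Normal"
```

As a worked case, a history of [10, 12, 8, 10, 12, 8, 10, 10] has mean 10 and a population standard deviation of about 1.41, so a current count of 13 gives z of about 2.1 (Elevated) and a count of 15 gives z of about 3.5 (Critical).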

April 18, 2026 — Issues Framework

Issues: tool_failure, conversation_breakdown, List View, and Status Lifecycle

The Issues system is now live. It continuously monitors completed agent runs and surfaces systematic failure patterns as issues requiring operator attention.

tool_failure detector — Reads the status field on tool-type spans. Spans with status = "error" are tool failure candidates. When three or more runs fail on the same tool within a 24-hour window, an issue is created, including the tool name, agent name, most common error messages (deduplicated by prefix), affected run IDs, first-seen and last-seen timestamps, and a severity score derived from the failure rate relative to total call volume. An initial backfill job runs over the past 30 days of spans so teams see issues populated immediately rather than waiting for new runs.
conversation_breakdown detector — Applies three heuristics to the full transcript of each multi-turn run: loop detection (three or more turns where consecutive agent response similarity exceeds 0.92); contradiction detection (an NLI classifier identifies pairs of agent statements within the same run that are logically inconsistent); and incomplete termination (the run ends with uncertainty markers in the agent’s final turn without a preceding tool call that would justify the uncertainty). Runs satisfying at least two of the three heuristics are scored as breakdown candidates. An issue is created when five or more runs from the same agent match within a seven-day window. Offending turn pairs are recorded as span-level members, linking the issue detail directly to the specific spans where breakdown occurred.
Issues list and detail — The Issues section is now in the main navigation. The list shows issue type, agent name, first and last seen dates, severity, affected run count, and status, with filters for type, agent, status, and date range. The detail page includes a severity sparkline, a “Member Runs” table with links to the full run detail page, a 30-day event count bar chart, and a “Detector Output” section showing the raw detector findings. Issues support four statuses — open, acknowledged, resolved, and muted (suppressed for 30 days, re-opening automatically if events continue). Status changes are logged in the timeline with the operator’s name and timestamp.
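The tool_failure rule described in this entry (three or more runs failing on the same tool within a 24-hour window) might be approximated as in the Python sketch below. The span dictionary keys (tool, run_id, status, ended_at) are hypothetical, and evaluating a single trailing window ending at `now` is a simplifying assumption; the changelog does not say how windows are aligned.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)   # from the changelog
MIN_FAILED_RUNS = 3            # from the changelog


def failing_tools(spans, now):
    """Return tool names with at least 3 distinct failed runs in the trailing 24 hours.

    Each span is a dict with hypothetical keys: tool, run_id, status, ended_at.
    """
    failed_runs = defaultdict(set)
    for span in spans:
        recent = now - WINDOW <= span["ended_at"] <= now
        if span["status"] == "error" and recent:
            # A set counts each run once even if it failed on the tool repeatedly.
            failed_runs[span["tool"]].add(span["run_id"])
    return sorted(
        tool for tool, runs in failed_runs.items() if len(runs) >= MIN_FAILED_RUNS
    )
```

Counting distinct run IDs rather than error spans matters here: one run that retries a broken tool ten times should not open an issue on its own.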

April 13, 2026 — Clustering, Use Cases, and Chat Eval Tools

Clustering: Semantic Run Clustering, UMAP Canvas, and Use Cases View

Agent runs are now automatically clustered by semantic content, grouping runs that address similar user intents and surfacing use-case patterns that would otherwise require manual labeling.

Embedding — After each agent run completes, a background job encodes the run’s input as a 1536-dimension embedding using text-embedding-3-small, stored in the agent_runs table.
Centroid assignment — Nightly, a sweep computes the cosine distance from each unassigned run’s embedding to all existing cluster centroids for its agent. If the nearest centroid is within the distance threshold (default 0.25), the run is assigned to that cluster. If no centroid is within threshold, a new cluster is seeded with the run’s embedding as its initial centroid.
Merge sweep — Nightly, pairs of clusters whose centroids are within 0.15 cosine distance of each other are merged. The merged centroid is the average of the two centroids, weighted by run count. Merges are logged for audit purposes.
Topic labeling — After a cluster accumulates 10 or more runs, a language model produces a 2–5 word topic label summarizing the common intent of a sample of 10 runs. Labels are refreshed when the cluster’s run count doubles or a merge changes its centroid by more than 0.05.
UMAP 2D canvas — UMAP is run nightly with n_neighbors=15 and min_dist=0.05 to produce 2D coordinates stored alongside the cluster assignment. The canvas is a WebGL-accelerated scatter plot where points are sized by run duration and colored by cluster. Cluster centroids are shown as larger filled circles with topic labels. The canvas is pan-and-zoom enabled and supports a merge sweep overlay showing which runs changed cluster assignment, connected to their new centroid with a thin line.
Use Cases view — The Use Cases section under Agent Analytics presents the cluster directory for each agent: topic label, run count, percentage of total runs, and date range. A coverage badge shows what percentage of the agent’s total runs have been assigned to a cluster. Unclustered runs are shown separately. Each cluster retains a snapshot history of its centroid and label across nightly sweeps.

Chat: evaluations and agent run tools

Eval tools allow querying evaluator configurations, retrieving results for a specific agent or date range, listing pending human evaluation items, submitting grades, and skipping queue items. Agent run tools cover listing runs with filter and sort, fetching full run detail including all spans and attributes, and computing token cost breakdowns. When a question requires both agent run data and user event data, Chat calls tools from both domains in sequence and presents all results together in a single response.
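For the clustering pipeline described earlier in this entry, the centroid assignment and merge steps can be sketched in plain Python as follows. The function names are illustrative, embeddings are assumed to be non-zero vectors, and treating a distance exactly equal to the threshold as "within" is an assumption.

```python
import math

ASSIGN_THRESHOLD = 0.25  # from the changelog: default assignment distance


def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def assign(embedding, centroids):
    """Return the index of the nearest centroid, seeding a new cluster if none is close."""
    best_index, best_distance = None, float("inf")
    for index, centroid in enumerate(centroids):
        distance = cosine_distance(embedding, centroid)
        if distance < best_distance:
            best_index, best_distance = index, distance
    if best_index is not None and best_distance <= ASSIGN_THRESHOLD:
        return best_index
    # No centroid within threshold: the run's embedding seeds a new cluster.
    centroids.append(list(embedding))
    return len(centroids) - 1


def merge_centroids(c1, n1, c2, n2):
    """Average of two centroids, weighted by each cluster's run count."""
    total = n1 + n2
    return [(x * n1 + y * n2) / total for x, y in zip(c1, c2)]
```

Weighting the merged centroid by run count keeps a large, established cluster from being dragged toward a small one it absorbs.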

April 8, 2026 — Evaluations

Evaluations: LLM Judge, Code Evaluator, Human Grading Queue, and Composite

The Evaluations section launches today. It provides a systematic, repeatable way to score the quality of agent outputs across runs.

LLM judge evaluator — Sends each run’s input and output to a language model acting as an impartial scorer. A default rubric checks task completion, response quality, and safety. The judge returns a 0–1 score and a one-sentence rationale. The judge prompt, model (defaulting to claude-haiku-4-5), scoring rubric, pass threshold (default 0.7), and schedule (on completion, hourly, daily, or manual) are all configurable per evaluator.
Code evaluator — Executes a user-defined JavaScript or Python function against the agent’s input, output, and tool call results in a sandboxed environment (isolated V8 context for Node; restricted subprocess for Python) with a 5-second execution timeout. Useful for deterministic checks: JSON schema validation, regular expression matching, assertion of numeric ranges, or custom business-logic rules that do not require a language model.
Human grading queue — When an evaluator is configured with type "human", matching runs are placed into a grading queue. Graders access the queue from the Evaluations section and see one run at a time: the input, the agent’s output, any tool calls made, and the grading rubric. They assign a score from 1 to 5 (normalized to 0–1) and an optional comment. The queue shows pending item count, estimated grading time from the median of the last 100 graded items, and an item assignment system that prevents duplication.
Composite evaluator — Aggregates scores from two or more child evaluators using configurable weighting. The composite result is Pass if and only if all child evaluators pass.
Results surface — Each evaluator has a results page showing a 30-day time-series chart of average scores, a score distribution histogram, and a table of recent results with the judge’s rationale. The run detail page now includes an “Evaluations” panel showing all eval results for that run. Evaluators can be scoped to a subset of runs using event filter expressions and can be backfilled over historical runs up to 90 days.
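The composite evaluator rule (a weighted aggregate score, with Pass only if every child passes) is easy to misread as "pass if the weighted score clears a threshold", so a small Python sketch may help. ChildResult and the helper names are hypothetical; treating a score equal to the threshold as passing and mapping the 1 to 5 human grade linearly onto 0 to 1 are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ChildResult:
    name: str
    score: float       # normalized 0-1 score from the child evaluator
    threshold: float   # the child's own pass threshold (0.7 is the LLM judge default)
    weight: float = 1.0

    @property
    def passed(self):
        # Assumption: a score equal to the threshold counts as a pass.
        return self.score >= self.threshold


def composite(children):
    """Return (weighted average score, passed) for a composite evaluator."""
    total_weight = sum(c.weight for c in children)
    score = sum(c.score * c.weight for c in children) / total_weight
    # Pass requires every child to pass, regardless of the aggregate score.
    passed = all(c.passed for c in children)
    return score, passed


def normalize_human_grade(grade):
    """Map a 1-5 human grade onto 0-1 (assumed linear: 1 -> 0.0, 5 -> 1.0)."""
    return (grade - 1) / 4
```

For example, a judge scoring 0.9 with weight 2 and a schema check scoring 0.6 with weight 1 (both with a 0.7 threshold) aggregate to 0.8, yet the composite still fails because the schema check did not pass.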

April 6, 2026 — Boards and Cohorts

Boards: Pinnable Analytics Dashboards

Boards allow teams to pin any combination of Insights charts, Funnel results, Retention curves, and Flow paths to a shared dashboard that persists across sessions. Every chart rendered on these surfaces has a “Pin to Board” button in its context menu. The chart is saved with its full query configuration so that it recomputes live each time the board loads.

Boards use a grid layout with three column widths (1/3, 2/3, full width) and variable row heights; charts can be repositioned and resized by dragging. Boards can be shared with individual team members or made visible to the entire team. Shared boards are read-only for non-owners. A public share link (no login required, read-only) can be generated for external stakeholders. Auto-refresh intervals of 5 minutes, 15 minutes, 1 hour, or manual update all charts in parallel without a full page reload.

Cohorts: Saved User Segments

Cohorts allow teams to define persistent user segments based on property filters and behavioral criteria, then reuse those segments as filters across all analytics surfaces.

Cohorts are defined using a filter builder that supports AND/OR groupings of property conditions and behavioral conditions such as “performed event checkout_completed at least once in the last 30 days” or “did not perform event churned in the last 90 days”. Static cohorts capture the user set at definition time. Dynamic cohorts recompute on a schedule (hourly, daily, or weekly) and always reflect the current state of the filter conditions; membership is stored as a materialized list of distinct_id values for fast join performance. Any filter dropdown in Insights, Funnels, Retention, Flows, the Agent Runs list, or Evaluations results now includes a “Cohorts” section. A cohort comparison view shows a Venn diagram of user overlap between any two selected cohorts with counts and percentages.

April 5, 2026 — SDK 2.3.0, LangChain, and LlamaIndex

SDK 2.3.0: Long-Session Run Primitives

SDK 2.3.0 introduces three primitives for managing agent runs that span multiple processes, threads, or HTTP requests.

startRun(options) / start_run(options) creates a run record and returns a run ID, immediately visible in the Agent Runs list with status in_progress.
joinRun(runId) / join_run(run_id) returns a run context object for an existing run ID; subsequent withSpan calls within the returned context are attached to the specified run as child spans regardless of whether the calling process started the run, enabling worker processes to contribute spans to a run started by a dispatcher.
endRun(runId, options) / end_run(run_id, options) closes an existing run, recording its final status, output, and metadata, and finalizes the token cost summary.

wrapAgent is unchanged; the two styles can coexist within the same codebase.

Upgrade via npm install @trodo/sdk@2.3.0 or pip install trodo-sdk==2.3.0.

LangChain JS, LangChain Python, and LlamaIndex auto-instrumentation

A TrodoCallbackHandler class is available from @trodo/sdk/integrations/langchain (Node) and trodo_sdk.integrations.langchain (Python). It implements the respective framework’s callback interface and captures LLM calls, tool calls, chain executions, and retrieval operations as Trodo spans without manual withSpan wrapping. Callback events map to Trodo span kinds: handleLLMStart / handleLLMEnd (or their Python equivalents) produce an llm span with model name, prompt, completion, and token counts; handleToolStart / handleToolEnd produce a tool span; handleRetrieverStart / handleRetrieverEnd produce a retrieval span with the query and top-k retrieved document excerpts. For streaming LLM calls, tokens are accumulated and the span is closed with the full completion on end. The handler is compatible with LCEL pipelines in Node and LangChain expression chains in Python.

A TrodoEventListener class is available from trodo_sdk.integrations.llama_index (Node and Python). It implements the LlamaIndex event listener interface and maps LLMCompletionStartEvent / LLMCompletionEndEvent, ToolCallEvent, RetrieveEvent, and QueryStartEvent / QueryEndEvent to Trodo span kinds. Register it via Settings.callback_manager.add_event_listener(TrodoEventListener()).

When called within an active wrapAgent context, spans from both integrations are nested under the current run’s trace. When called outside any run context, they create a standalone trace.

April 1, 2026 — Chat: Inline Charts and Report Cards

Chat: Inline Chart Rendering and Report Cards

Query results in Chat now render as interactive charts inline in the conversation rather than raw JSON or plain text tables.

The underlying language model selects the chart type appropriate for the result structure: line charts for time series, bar charts for categorical breakdowns, funnel visualizations for multi-step conversion results, and horizontal bar charts for ranked lists. A chart type toggle in the result card allows the user to switch between chart and table views. Aggregate results (total counts, averages, conversion rates) are rendered as compact metric cards with a value, a label, and a delta indicator if a prior period comparison is available.

Each result card has a “Copy data” button that places the underlying result as a tab-separated table on the clipboard. A “Pin to Board” button saves the chart to a Board. Charts are rendered client-side using the same charting library used across the rest of the dashboard, ensuring visual consistency.

March 2026

March 30, 2026 — Chat: Natural Language Analytics Interface

Chat: Natural Language Analytics and Agent Observability Interface

The Chat interface launches today. It provides a conversational entry point to all Trodo analytics and agent observability data, accepting free-form questions and returning results drawn from the full data surface.

Tool-based architecture — Chat operates by injecting callable tools into the underlying language model’s context. Each tool corresponds to a Trodo API endpoint spanning analytics queries, agent run exploration, user identity, and session queries. The model selects and calls the appropriate tools to answer each query, then interprets the results and writes a human-readable response. When a question requires combining multiple data sources, the model calls tools in sequence, passing intermediate results forward.

Event catalog injection and fuzzy matching — At the start of each Chat session, the system injects a compressed event catalog listing all event names, their property schemas, and last-seen timestamps, allowing the model to resolve event references by description rather than exact name. When a query mentions an event name that does not exactly match any event in the catalog, fuzzy string matching suggests the closest match. If a unique best match is found, it is used automatically with a note in the response. Property names and their 20 most common observed values are included in the injected catalog so the model can construct valid filter expressions without the user specifying property names exactly.

Session history — Each Chat session is persisted. The left sidebar shows prior sessions with their first message as a title. Sessions can be renamed, pinned, or deleted. Returning to a session restores the full conversation history and all prior tool result context so follow-up questions can build on prior results without re-fetching data.
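The fuzzy event-name resolution described above could look roughly like the following. The changelog does not say which matching algorithm or threshold Chat uses, so this sketch assumes Python's difflib similarity ratio and a hypothetical cutoff of 0.6; resolve_event_name is an illustrative name, not a Trodo API.

```python
from difflib import SequenceMatcher


def resolve_event_name(query, catalog, cutoff=0.6):
    """Return the catalog event matching `query`, or None if ambiguous/unknown.

    An exact match wins outright. Otherwise the closest name is used only
    when it clears the cutoff and is strictly better than the runner-up.
    """
    if query in catalog:
        return query

    scored = sorted(
        ((SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
         for name in catalog),
        reverse=True,
    )
    candidates = [(score, name) for score, name in scored if score >= cutoff]
    if not candidates:
        return None
    if len(candidates) > 1 and candidates[0][0] == candidates[1][0]:
        return None  # tie: no unique best match, so do not guess
    return candidates[0][1]
```

Returning None on a tie mirrors the "unique best match" condition: when two names are equally close, the caller would need to ask the user instead of substituting one silently.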

March 27, 2026 — SDK 2.1.0 GA, Retention, and Flows

SDK 2.1.0 Generally Available: Anthropic and OpenAI Auto-Instrumentation

SDK 2.1.0 exits beta and is generally available for Node.js and Python. All core agent SDK functions — wrapAgent, withSpan, tool, llm, retrieval, feedback — are now stable within the 2.x series.

InstrumentAnthropic (Node) wraps the @anthropic-ai/sdk client at the transport layer. Every messages.create call is automatically wrapped in an llm span recording the model name, the full messages array, the completion, and the usage object including cache read and cache creation token counts.

InstrumentOpenAI (Node) wraps the openai npm package, instrumenting chat.completions.create, completions.create, and embeddings.create.

createTrodoMiddleware() returns an Express middleware that extracts a user ID from the X-Trodo-User-Id request header (or from a JWT sub claim if jwtSecret is configured) and sets it as the active user identity for any agent runs initiated within the request handler, eliminating the need to manually forward user IDs from the HTTP layer to agent code.

Retention: Cohort Return Rate Analysis

The Retention surface computes cohort return rates: for users who performed a specific starting action in a given time window, what fraction returned to perform a specific return action in each subsequent period? A query requires a starting event, a return event, and a time granularity (day, week, or month). Results are displayed as a retention triangle where rows are cohorts, columns are elapsed periods, and cells show the return rate color-coded from white (0 %) to dark blue (100 %). An average retention curve below the triangle summarizes the overall shape across all cohorts. Breakdown dimensions split the triangle by acquisition channel, plan tier, or any string property.

Flows: Path Analysis and Sankey Visualization

The Flows surface visualizes the sequences of events users take before or after a specified seed event. The query engine computes the N most frequent events that immediately follow (forward) or immediately precede (backward) the seed within the same session, recursively expanding the most frequent next step up to a configurable depth (default 5 steps). Results render as a Sankey diagram where node width is proportional to user count at that step and edge thickness to the users who made that specific transition. Paths that terminate before reaching the configured depth appear as drop-off nodes.
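A forward Flows query of the kind described above can be sketched as a recursive top-N count over session event sequences. This is an illustration under simplifying assumptions, not the query engine's implementation: it counts sessions rather than distinct users, starts from the first occurrence of the seed in each session, and uses the hypothetical names forward_flows and "(drop-off)".

```python
from collections import Counter

DROP_OFF = "(drop-off)"  # hypothetical label for paths that end early


def forward_flows(sessions, seed, top_n=3, depth=5):
    """Build a tree of the most frequent events following `seed`.

    `sessions` is a list of event-name lists, each in chronological order.
    """
    tails = [s[s.index(seed) + 1:] for s in sessions if seed in s]
    return _expand(tails, top_n, depth)


def _expand(tails, top_n, depth):
    if depth == 0 or not tails:
        return {}
    # Count what comes immediately next; an empty tail is a drop-off.
    counts = Counter(t[0] if t else DROP_OFF for t in tails)
    tree = {}
    for event, count in counts.most_common(top_n):
        if event == DROP_OFF:
            tree[event] = {"count": count, "next": {}}
        else:
            remaining = [t[1:] for t in tails if t and t[0] == event]
            tree[event] = {
                "count": count,
                "next": _expand(remaining, top_n, depth - 1),
            }
    return tree
```

Each node's count corresponds to node width in the Sankey diagram, and the count on a child corresponds to the thickness of the edge from its parent. A backward query could reuse the same code on reversed sessions.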

March 23, 2026 — Developer Docs and Groups SDK

Developer Documentation Site

The Trodo developer documentation site launches today, organized into six top-level sections.

Getting Started — A five-minute quickstart covering account creation, write key retrieval, and sending the first event, with a copy-pasteable HTML snippet and one-liners for Node.js and Python server apps.

Event Analytics SDK — Full reference for the JavaScript and Python SDKs: init, track, identify, page, group, reset, alias, auto-events configuration, session management, privacy controls, and the batch ingest API.

Agent Analytics SDK — Full reference for wrapAgent, withSpan, tool, llm, retrieval, feedback, startRun, joinRun, endRun, and all auto-instrumentation modules, with architecture diagrams showing how spans relate to runs and how runs relate to user identity.

Integrations — Setup guides for LangChain (JS and Python), LlamaIndex, Vercel AI SDK, OpenTelemetry OTLP, Amazon Bedrock, Google Generative AI, Cohere, Mistral, and HTTP fetch instrumentation.

MCP Server — Tool catalog reference, authentication guide, and quickstart configurations for Claude Web, Claude Desktop, Claude Code, and Cursor.

Recipes — Ten end-to-end examples: dual-export, cross-service tracing, sub-agent patterns, multi-model routing, evaluation-driven prompt tuning, cohort-based A/B analysis, and more.

Groups SDK: B2B Organization Tracking

group(groupType, groupId, properties) associates the current user with a named organizational entity. groupType names the kind of entity (e.g., "company", "workspace"); groupId is the unique identifier for the specific instance; properties is a flat object of group attributes such as name, plan, industry, and employee count.

Calling group() sends a $groupidentify event that creates or updates the group record. All subsequent events from the user are enriched with the group’s current properties, accessible in Insights breakdowns and Funnel filters as group.<property_name>. A user can belong to groups of multiple types simultaneously.

The Groups section in the main navigation lists all groups of each type with member count, first-seen date, and last-seen date; each group has a detail page showing properties and a list of member users.

March 20, 2026 — Agent Runs Dashboard and Span Waterfall

Agent Analytics: Runs List, Span Waterfall, Run Detail, and Cost Analytics

The Agent Analytics section is now available from the main navigation.

Runs list — A paginated table showing all agent runs with columns for run ID, agent name, status (completed, failed, in_progress, cancelled), start time, duration, input and output token counts, cost estimate, and user identity linked to the user profile if a userId was provided. Column visibility is configurable per-user and persists across sessions. Filtering by agent name, status, date range, user ID, and run duration range is available; active filters are shown as removable chips. A summary card above the table shows total runs, success rate, median duration, median cost per run, and a daily run count sparkline, all updating live as filters change. A “Traces” tab shows all spans across all runs, filterable by span kind, tool name, and duration.

Span waterfall — Each span is rendered as a horizontal bar whose left edge is its start time relative to the run start and whose width is proportional to its duration. Child spans are indented below their parents. Spans are color-coded by kind: LLM calls in blue, tool calls in amber, retrieval in green, generic in grey, error spans outlined in red. Clicking any span bar opens a slide-out panel showing the full span record: name, kind, status, timestamps, duration, parent span ID, model name, token counts, tool name, input/output excerpts, error message, and raw metadata.

Run detail — The detail page shows a metadata card (status, agent name, start/end time, duration, linked user profile), a stats row (token counts including cache read and cache creation, total cost, span count, tool call count, LLM call count), the embedded span waterfall, and raw input/output in syntax-highlighted JSON viewers with copy buttons. A feedback widget allows operators to submit a thumbs up/down rating and optional free-text comment, stored in agent_run_feedback and accessible via the SDK’s feedback API.

Token and cost analytics — A dedicated Token and Cost Analytics page shows total spend broken down by agent name, model name, and day. Each agent row shows total input tokens, total output tokens, total cost, cost per run, and cost per successful run. A configurable pricing table covers all major models; per-run cost breakdowns are itemized by model for multi-model runs. Daily and monthly spend alerts send a Slack notification (if a webhook is configured) and show a dashboard banner when crossed.
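The span waterfall geometry described above (left edge from start offset, width from duration, children indented under parents) can be sketched as a small layout function. The Span fields, the layout_waterfall name, and the pixel width are assumptions made for illustration; the dashboard's actual rendering code is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None for top-level spans
    start_ms: int
    end_ms: int


def layout_waterfall(spans, run_start_ms, run_end_ms, width_px=800):
    """Return one row per span, parents first, children directly below."""
    total_ms = max(run_end_ms - run_start_ms, 1)  # avoid division by zero

    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    rows = []

    def visit(parent_id, indent):
        # Depth-first walk so each child row follows its parent row.
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            rows.append({
                "span_id": span.span_id,
                "indent": indent,
                "left_px": (span.start_ms - run_start_ms) * width_px / total_ms,
                "width_px": (span.end_ms - span.start_ms) * width_px / total_ms,
            })
            visit(span.span_id, indent + 1)

    visit(None, 0)
    return rows
```

Sorting siblings by start time keeps the bars reading left to right within each nesting level.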

March 16, 2026 — Auto-Events SDK v2

Auto-Events SDK v2: 15 Captured Event Types, Form Events, Errors, and Page Performance

The auto-events SDK has been rewritten to capture 15 event types automatically with no configuration beyond passing autoEvents: true to init().

Interaction events — Element clicks ($click), rage clicks (rage_click), dead clicks (dead_click), text selection (text_select), and element view (element_view) are captured. Element click events include the CSS selector path, text content truncated to 64 characters, the element’s bounding box position, and whether the element is interactive. A rage click is three or more clicks within 600 milliseconds within a 50 px radius; a dead click is a click that produces no DOM mutation within 750 milliseconds. The element_view event fires once per element per page view session via IntersectionObserver. All element-interaction events include an Element Data property group with selector, tag, id, class, text, href, and position.

Form events — form_start fires on first field focus; form_submit fires on form submission; form_abandon fires when the user navigates away from a started but unsubmitted form, including the time spent and the last field with focus. field_focus and field_blur are tracked per field, recording field_name, field_type, and whether the field was empty when focus was lost. form_validation_error fires for each field that fails an HTML5 constraint, recording the field name and constraint type.

JavaScript and network errors — Unhandled exceptions and unhandled promise rejections are captured as js_error events with the error message, stack trace (first 2,000 characters), and throw-site location. Failed HTTP requests (4xx, 5xx, network timeouts, CORS failures) are captured as network_error events by wrapping native fetch and XMLHttpRequest, recording the sanitized URL, HTTP method, status code, and response time. A per-session throttle of 10 errors prevents flooding.

Page performance — A page_performance event is emitted once per page view with Web Vitals metrics: LCP, FID, CLS, TTFB, and FCP, consistent with what Google’s web-vitals library reports.

Runtime API — Trodo.autoEvents.enable() / disable() toggle the system at runtime (useful for consent flows). Individual types can be toggled with enableType('rage_click') / disableType('page_performance'). Per-event throttles are overridable at runtime via setThrottle. A beforeCapture(eventType, properties) => properties | null hook provided at init time or set at runtime fires before each event is sent; returning null discards the event, providing a PII scrubbing and conditional suppression hook.

Media events — media_play, media_pause, and media_ended are captured for <video> and <audio> elements, including current playback position, duration, and whether the action was user-initiated or autoplay.
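The rage-click rule stated under Interaction events (three or more clicks within 600 milliseconds within a 50 px radius) can be expressed as a short predicate. The sketch below is written in Python for illustration rather than taken from the browser SDK, and it assumes the radius is measured from the first click of each candidate burst, which the changelog does not specify.

```python
import math


def is_rage_click(clicks, min_clicks=3, window_ms=600, radius_px=50):
    """Detect a rage click in a list of (timestamp_ms, x, y) tuples.

    `clicks` must be sorted by timestamp. Each click is tried as the
    start of a burst; later clicks count if they fall inside both the
    time window and the radius around that first click.
    """
    for i, (t0, x0, y0) in enumerate(clicks):
        burst = [
            (t, x, y)
            for t, x, y in clicks[i:]
            if t - t0 <= window_ms and math.hypot(x - x0, y - y0) <= radius_px
        ]
        if len(burst) >= min_clicks:
            return True
    return False
```

The defaults mirror the thresholds quoted in the release notes, so a caller only needs to pass the click stream for a page view.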

March 14, 2026 — Agent Analytics SDK Beta

Agent Analytics SDK Beta: wrapAgent, withSpan, and Span Types

The Agent Analytics SDK is now available in beta for Node.js (@trodo/sdk) and Python (trodo-sdk). It provides primitives for instrumenting AI agent workloads and sending structured telemetry to the Agent Analytics section of the dashboard.

wrapAgent(name, fn, options) is the primary entry point. It wraps an async function as a named agent run: the wrapper creates a run record before the function executes and closes it with status and output when the function resolves or rejects. Options: userId, input, metadata.

withSpan(name, fn, options) creates a child span within the current active run context. The context is propagated via AsyncLocalStorage (Node) or contextvars (Python), so no explicit context passing is required. Options: kind (tool, llm, retrieval, or generic), input, output, metadata. Convenience wrappers tool(name, fn), llm(name, fn), and retrieval(name, fn) call withSpan with the appropriate kind. If the wrapped function throws, the span is closed with status: "error" and the error message and stack trace are recorded as span attributes; the error is re-thrown so normal error handling is unaffected.

feedback(runId, options) records user or operator feedback on a completed run: score (0–1), label (positive or negative), comment, userId.

March 9, 2026 — Analytics: Insights, Funnels, Sessions, and Event Catalog

Insights, Funnels, Sessions, Event Catalog, and Privacy Controls

Four analytics surfaces and supporting infrastructure launch this week.

Insights — The core analytics query interface for exploring event volume, unique user counts, and aggregate numeric metrics over time. Metric types: event count, unique users, formula (a mathematical expression combining other metrics), and property aggregation (sum, average, median, percentile, min, or max of a numeric property on matching events). Queries can be scoped to a preset or custom date range with independently selectable time granularity (hour, day, week, month). Any string property can be used as a breakdown dimension, returning one time series per distinct value up to a configurable maximum. Filters use AND/OR groupings of property conditions. Saved queries appear in a sidebar and can be added to Boards.

Funnels — Defines sequences of 2–10 event steps. Each step specifies the event name and optional property filters; steps must be completed in order within a configurable conversion window (default 7 days). The funnel displays the number of users who reached each step, the number who converted to the next, the conversion rate, and the median time to convert. Any string property can be used as a breakdown dimension. A “Dropped off users” link at each step boundary opens a filtered user list that can be exported or saved as a cohort.

Sessions page — A paginated table of all sessions with columns for session ID, user identity, first and last event times, duration, page view count, event count, device type, browser, OS, country, and first referrer. The session detail view shows a chronological event timeline; each event expands to show its full property set. Session records include user agent parsing (browser name and version, OS name and version, device type and model) computed server-side at ingest time. UTM parameters and the initial referrer URL are captured at session creation and preserved for the session lifetime for first-touch attribution.

Event catalog — A browsable registry of all event types that have been sent to the project. Events are auto-discovered within 60 seconds of first occurrence. Operators can add human-readable descriptions to any event and any of its properties, visible as tooltips throughout the dashboard and injected into Chat sessions as context. String properties with fewer than 500 distinct values show a frequency table of the top 20 observed values. Events can be marked Hidden (excluded from autocomplete but still queryable) or Verified / Unverified.

Geo enrichment and privacy controls — All incoming events and sessions are enriched with country, city, region, and rounded latitude / longitude derived from the client IP using a locally hosted MaxMind GeoLite2 database. The raw IP is not stored after enrichment. When the DNT: 1 header is present, event ingest is halted for that session. A beforeCapture hook and a hashPII(value) SHA-256 HMAC utility support PII scrubbing at the SDK layer. Per-event-type data retention can be configured from 30 days to 7 years in Settings > Privacy.
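The ordered-steps-within-a-window rule from the Funnels description above can be sketched as follows. This is a deliberately simple greedy pass, not the production query: it anchors the conversion window at each user's first occurrence of step one and does not retry from later occurrences, and funnel_counts is a hypothetical name.

```python
SEVEN_DAYS = 7 * 24 * 3600  # default conversion window, in seconds


def funnel_counts(events_by_user, steps, window_seconds=SEVEN_DAYS):
    """Count how many users reached each funnel step.

    `events_by_user` maps a user ID to a list of (timestamp_s, event_name).
    Steps must occur in order, all within the window after step one.
    """
    reached = [0] * len(steps)
    for events in events_by_user.values():
        step, started_at = 0, None
        for ts, name in sorted(events):
            if step == len(steps):
                break
            if name != steps[step]:
                continue
            if step == 0:
                started_at = ts
            elif ts - started_at > window_seconds:
                break  # too late to count as a conversion
            step += 1
        for i in range(step):
            reached[i] += 1
    return reached


def conversion_rates(reached):
    # Step-to-step conversion; 0.0 when nobody reached the earlier step.
    return [b / a if a else 0.0 for a, b in zip(reached, reached[1:])]
```

The reached list corresponds to the per-step user counts shown in the funnel, and the difference between adjacent entries is the population behind each "Dropped off users" link.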

March 3, 2026 — JavaScript and Python SDK Beta

JavaScript SDK Beta and Python SDK Beta

The Trodo JavaScript SDK (@trodo/sdk) and Python SDK (trodo-sdk) are now available in beta.

JavaScript SDK — Available as a self-contained CDN loader snippet (under 500 bytes before gzip) and as an npm package with ES module and CommonJS dual-export supporting tree-shaking. init(writeKey, options) initializes the SDK. track(event, properties) records a named event. identify(userId, traits) links the current session to a known user and merges traits into the user’s profile; a UUID anonymousId is generated on first load and persisted in localStorage with a cookie fallback, and historical events are retroactively linked when identify() is called. page(name, properties) records a page view; passing capturePageViews: true enables automatic page view events on every navigation including soft navigations in single-page apps via History API monkey-patching. group(groupType, groupId, properties) associates the current user with an organizational entity. reset() clears the current identity for logout flows. The SDK creates a session on first event and maintains continuity across page loads within a 30-minute inactivity window; session ID, start time, and session number are included on every event. UTM parameters (utm_source, utm_medium, utm_campaign, utm_content, utm_term) are read from the URL query string on initialization and included on every event in the session.

Python SDK — Available on PyPI. track(distinct_id, event, properties), identify(distinct_id, traits), group(distinct_id, group_type, group_id, properties), and page(distinct_id, name, properties) are non-blocking: events are added to an in-memory queue and flushed to the batch ingest API by a background daemon thread every 200 milliseconds or when the queue reaches 100 items, with exponential backoff on API errors. flush() is provided for serverless environments to ensure all queued events are sent before the process exits. TrodoMiddleware for Django and Starlette/FastAPI automatically sets the active user identity from the X-Trodo-User-Id request header and records page events for every incoming request.

Batch ingest API — The /batch endpoint accepts arrays of up to 500 events (max 500 KB) per POST. Each event requires a type (track, identify, page, group, or alias), an ISO 8601 timestamp, a context object identifying the sending SDK, and either anonymousId or userId. Invalid events are rejected individually without failing the entire batch. The endpoint is authenticated via Authorization: Basic <base64(writeKey:)> or the X-Write-Key header. Rate limiting is 10,000 events per minute per write key with 429 Too Many Requests and a Retry-After header on exceedance. Events with a messageId field are deduplicated within a 24-hour window. alias(userId, previousId) creates a permanent link between two user IDs, treating them as the same user in all queries.
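For clients that call the batch ingest API directly, the limits and authentication scheme described above suggest a small amount of client-side preparation. The sketch below only builds headers and splits events into batches; it sends nothing. It assumes that 500 KB means 500,000 bytes and estimates payload size from the JSON encoding of each event, neither of which is confirmed by the changelog, and the function names are illustrative.

```python
import base64
import json

MAX_EVENTS = 500        # documented per-request event limit
MAX_BYTES = 500_000     # assumption: 500 KB read conservatively as 500,000 bytes


def basic_auth_header(write_key):
    # Basic auth with the write key as username and an empty password.
    token = base64.b64encode(f"{write_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def chunk_events(events, max_events=MAX_EVENTS, max_bytes=MAX_BYTES):
    """Yield lists of events that respect both batch limits."""
    batch, size = [], 2  # 2 bytes for the enclosing JSON brackets
    for event in events:
        # +2 approximates the ", " separator json.dumps puts between items.
        encoded = len(json.dumps(event).encode("utf-8")) + 2
        if batch and (len(batch) >= max_events or size + encoded > max_bytes):
            yield batch
            batch, size = [], 2
        batch.append(event)
        size += encoded
    if batch:
        yield batch
```

Each yielded batch would become one POST body. Including a messageId on every event makes retries after a 429 response safe, since duplicates within the 24-hour window are dropped server-side.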
