

What gets auto-instrumented

Install opentelemetry-instrumentation-llamaindex alongside llama-index. Once it is installed, every retriever, query, and chat call made inside a wrap_agent block emits spans automatically:
| Construct | Span kind | Auto-extracted |
| --- | --- | --- |
| VectorIndexRetriever.retrieve | retrieval | query, doc count, doc ids |
| QueryEngine.query | agent | wraps retrieval + synthesis |
| ChatEngine.chat | agent | wraps retrieval + LLM + memory |
| ResponseSynthesizer | llm | model, tokens |
| Node post-processors | generic | name per post-processor |
| Embeddings | llm | model, tokens |

Install

pip install llama-index opentelemetry-instrumentation-llamaindex

Minimal example: RAG query

import os

import trodo
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Initialise the SDK once at startup, before any traced calls.
trodo.init(site_id=os.environ["TRODO_SITE_ID"])

# Standard LlamaIndex RAG setup: load documents, index them, expose a query engine.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

with trodo.wrap_agent("llamaindex-rag") as run:
    run.set_input({"q": "What is HNSW?"})
    response = query_engine.query("What is HNSW?")
    run.set_output(str(response))
Resulting tree:
run (wrap_agent)
 └─ agent   QueryEngine.query
      ├─ retrieval   VectorIndexRetriever.retrieve
      ├─ generic     post-processor: SimilarityPostprocessor
      └─ llm         ResponseSynthesizer (gpt-4o-mini)

Chat engine with memory

chat_engine = index.as_chat_engine(chat_mode="context")

for msg in incoming_messages:
    with trodo.wrap_agent("support-bot", conversation_id=session_id) as run:
        run.set_input({"message": msg})
        response = chat_engine.chat(msg)
        run.set_output(response.response)
Each turn is its own wrap_agent run; the shared conversation_id stitches the runs into the session view.

Retrieval span details

The default retriever populates:
  • span.input → the raw query string
  • span.output → list of { id, score, text_preview } for each retrieved node
  • span.attributes.top_k → configured similarity_top_k
  • span.attributes.doc_count → how many nodes actually came back
If you post-process (rerank, filter), each post-processor adds a child span so you can see how the doc set narrows.
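
As an illustration, here is a minimal sketch of a query engine whose trace shows that narrowing, using stock LlamaIndex components and the index built in the example above; the top_k and cutoff values here are arbitrary:

from llama_index.core.postprocessor import SimilarityPostprocessor

# Retrieve 8 candidates, then drop nodes scoring below 0.75.
query_engine = index.as_query_engine(
    similarity_top_k=8,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.75)],
)

with trodo.wrap_agent("llamaindex-rag") as run:
    # The retrieval span reports top_k=8 and doc_count for the nodes returned;
    # a child span named after SimilarityPostprocessor shows the filtered set.
    response = query_engine.query("What is HNSW?")
    run.set_output(str(response))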

Auto vs. manual cheat sheet

| Operation | Auto? | Notes / manual fallback |
| --- | --- | --- |
| VectorIndexRetriever | yes | |
| Custom BaseRetriever subclass | partial | must call super()._retrieve() for callbacks to fire; otherwise wrap in trodo.retrieval() (first sketch below) |
| QueryEngine.query | yes | |
| ChatEngine.chat | yes | |
| SubQuestionQueryEngine | yes | each sub-question becomes a nested query span |
| ReActAgent / FunctionCallingAgent | yes | steps appear as alternating llm + tool spans |
| Custom node post-processors | yes | name taken from the class |
| Ingestion pipeline | no | usually offline; wrap it in a separate run with wrap_agent("ingest-docs") (second sketch below) |
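
For the custom BaseRetriever row, here is a minimal sketch of the manual fallback. It assumes trodo.retrieval() is a context manager exposing set_input/set_output like the run handle above (check the SDK reference for the exact signature); the retriever and its _keyword_lookup helper are hypothetical:

from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle

class KeywordRetriever(BaseRetriever):
    # Hypothetical retriever that bypasses LlamaIndex callbacks, so it
    # wraps its own work in an explicit retrieval span.
    def _retrieve(self, query_bundle: QueryBundle) -> list[NodeWithScore]:
        with trodo.retrieval("keyword-retriever") as span:
            span.set_input(query_bundle.query_str)
            nodes = self._keyword_lookup(query_bundle.query_str)  # your own lookup logic
            span.set_output([{"id": n.node.node_id, "score": n.score} for n in nodes])
            return nodes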
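
And for the ingestion row, a short sketch of giving an offline ingestion job its own run; the SentenceSplitter transformation is just an example:

from llama_index.core import SimpleDirectoryReader
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter

pipeline = IngestionPipeline(transformations=[SentenceSplitter()])

with trodo.wrap_agent("ingest-docs") as run:
    documents = SimpleDirectoryReader("./docs").load_data()
    nodes = pipeline.run(documents=documents)
    run.set_output({"node_count": len(nodes)})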

Gotchas

  • Only the Python SDK auto-instruments LlamaIndex. The JS port (llamaindex npm package) has no first-party OTel bindings — wrap retrievers with trodo.retrieval() and synthesiser calls with trodo.llm().
  • Token counts come from the underlying LLM (OpenAI, Anthropic). If you’re using a local LLM via CustomLLM, set extract_usage on a trodo.llm wrapper or call trodo.track_llm_call after each completion (see the first sketch after this list).
  • Streaming query engines (as_query_engine(streaming=True)) record the span when the stream finishes. Partial results appear only if you call set_output yourself mid-stream (see the second sketch after this list).
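
For the token-count gotcha, a sketch of manual reporting after a CustomLLM completion. Only trodo.track_llm_call itself appears on this page; the argument names below are assumptions, so check the SDK reference:

# local_llm is your CustomLLM instance; the keyword arguments are hypothetical.
completion = local_llm.complete(prompt)
trodo.track_llm_call(
    model="my-local-model",   # placeholder: whatever your CustomLLM serves
    input=prompt,
    output=completion.text,
    usage={"input_tokens": tokens_in, "output_tokens": tokens_out},  # counts you compute yourself
)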
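
For the streaming gotcha, a sketch that surfaces partial output mid-stream; it assumes set_output can be called repeatedly on the same run, which is worth verifying against the SDK reference:

streaming_engine = index.as_query_engine(streaming=True)

with trodo.wrap_agent("llamaindex-rag-stream") as run:
    run.set_input({"q": "What is HNSW?"})
    response = streaming_engine.query("What is HNSW?")
    chunks = []
    for token in response.response_gen:   # yields text deltas as they arrive
        chunks.append(token)
        run.set_output("".join(chunks))   # partial result visible before the stream ends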