Observability

SentientOne automatically captures every agent interaction, giving you full visibility into requests, responses, token usage, latency, and cost across every conversation, every agent, every day. You need no instrumentation libraries, no tracing sidecars, and no cost-calculation middleware.

What you get for free

Captured automatically · zero code from you

[Dashboard panel: "Every request, observed" · Range: 24h · Agent: All · Export CSV]
Requests: 4,210 · Tokens: 1.24 M · Avg latency: 1.42 s · Cost: $18.40

Captured field       What it tells you
request_messages     Full conversation history sent to the LLM (up to 50 messages)
response_content     Complete LLM response text
prompt_tokens        Tokens consumed by the input / prompt
completion_tokens    Tokens consumed by the generated response
total_tokens         Sum across all LLM rounds including tool-use iterations
cost                 USD cost computed from cost_per_1m_tokens
provider / model     LLM provider and model name used for this request
status               success or error, plus message if applicable
latency_ms           End-to-end including MCP tool calls
trace_id             X-Trace-Id header; pivot into Tracing

Zero instrumentation · Per-request + per-agent · Rolled up daily

You don't add a single line of logging to your application. Every call to /v1/chat is recorded the moment it lands, and the data flows straight into Analytics for trends and Tracing for per-request inspection.
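
As a rough illustration of what this looks like from the caller's side, the sketch below sends one chat request and keeps the X-Trace-Id response header for a later lookup in Tracing. Only the /v1/chat path and the X-Trace-Id header name come from this page; the base URL, the bearer-token auth scheme, and the JSON body shape are assumptions made for illustration.

```python
import json
import urllib.request
from typing import Mapping, Optional, Tuple

# Hypothetical values: replace with your deployment's base URL and API key.
BASE_URL = "https://sentientone.example.com"
API_KEY = "YOUR_API_KEY"


def build_chat_request(message: str) -> urllib.request.Request:
    """Build a POST to /v1/chat. The JSON body shape is an assumption."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # auth scheme is assumed
        },
    )


def extract_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Trace-Id value, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-trace-id":
            return value
    return None


def send_chat(message: str) -> Tuple[str, Optional[str]]:
    """Send one request and return (raw response body, trace id or None)."""
    with urllib.request.urlopen(build_chat_request(message)) as resp:
        return resp.read().decode("utf-8"), extract_trace_id(resp.headers)
```

Note that the caller contains no logging or metrics code. The only observability-related step is holding on to the trace id, which is optional and useful mainly when you want to jump from an application log line to the matching request in Tracing.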

What is captured per request

  • request_messages: Full conversation history sent to the LLM (up to the last 50 messages). Includes the injected system prompt and any tool-use rounds.
  • response_content: Complete LLM response text exactly as returned to your caller. Useful when debugging unexpected agent behaviour or content-policy blocks.
  • prompt_tokens: Tokens consumed by the input / prompt for this request, measured with the provider's native tokeniser.
  • completion_tokens: Tokens consumed by the generated response. Together with prompt_tokens, this gives the full count of billable tokens.
  • total_tokens: Sum across every LLM round, including MCP tool-use iterations. This is the number that drives cost attribution.
  • cost: Calculated LLM cost in USD using the agent's Token Cost rates. Rolls up cleanly to per-agent, per-day, and per-organisation totals.
  • provider / model: Which provider and model handled this request, for example Anthropic claude-sonnet-4 or OpenAI gpt-4o.
  • status: success or error, with the error message if applicable. Lights up the status pill in Tracing.
  • latency_ms: End-to-end latency including all MCP tool calls. The latency breakdown bar decomposes it by step.
  • trace_id: Unique id surfaced as X-Trace-Id on every response. Paste it into a Tracing filter to find this exact request in seconds.
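
The token and cost fields above are related by simple arithmetic. The sketch below shows one plausible reading: total_tokens sums prompt and completion tokens over every LLM round, and cost applies per-million-token rates to those sums. The LlmRound type, the split into separate input and output rates, and the example numbers are assumptions; this page only states that cost is computed from cost_per_1m_tokens.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class LlmRound:
    """Token usage for one LLM round (hypothetical type, for illustration)."""
    prompt_tokens: int
    completion_tokens: int


def total_tokens(rounds: Iterable[LlmRound]) -> int:
    """Sum prompt and completion tokens across all rounds, tool-use rounds included."""
    return sum(r.prompt_tokens + r.completion_tokens for r in rounds)


def request_cost(
    rounds: Iterable[LlmRound],
    input_cost_per_1m: Decimal,   # assumed: USD per 1M prompt tokens
    output_cost_per_1m: Decimal,  # assumed: USD per 1M completion tokens
) -> Decimal:
    """Compute USD cost from per-million-token rates."""
    rounds = list(rounds)  # allow generators, since we iterate twice
    prompt = sum(r.prompt_tokens for r in rounds)
    completion = sum(r.completion_tokens for r in rounds)
    weighted = Decimal(prompt) * input_cost_per_1m + Decimal(completion) * output_cost_per_1m
    return weighted / Decimal(1_000_000)


# One request that needed two LLM rounds (for example, one tool call in between).
example_rounds = [LlmRound(1200, 150), LlmRound(1500, 300)]
example_total = total_tokens(example_rounds)
example_cost = request_cost(example_rounds, Decimal("3.00"), Decimal("15.00"))
```

With these made-up rates, the two rounds add up to 3,150 tokens and a cost of 0.01485 USD. Decimal is used instead of float so that many small per-request costs can be summed into daily totals without accumulating rounding drift.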

Why this matters at scale

  1. One place, all agents

    Without a shared platform, a typical organisation running 5–20 agents across multiple teams would need separate logging, monitoring, and cost-tracking implementations per app. SentientOne centralises all of this automatically.
  2. Audit and compliance

    Every LLM interaction is logged with timestamps, user ids, and conversation ids — ready for compliance reviews, security audits, or dispute resolution.
  3. Cost control

    Know exactly which agents, users, or workflows are driving AI spend before your LLM bill arrives. Set up alerts or impose limits at the agent level.
  4. Prompt-engineering feedback loop

    Compare token counts and response quality across system prompt iterations to find the most efficient and accurate configuration for each agent.
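
To make the cost-control point concrete, here is a minimal sketch that rolls exported request rows up to per-agent daily spend and flags any agent over a daily budget. The CSV column names (agent, timestamp, cost), the ISO 8601 timestamp format, and the budget value are assumptions about the export, not a documented schema.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List, Tuple

AgentDay = Tuple[str, str]  # (agent name, YYYY-MM-DD)


def daily_cost_by_agent(csv_text: str) -> Dict[AgentDay, Decimal]:
    """Sum cost per (agent, day) from CSV text. Column names are assumed."""
    totals: Dict[AgentDay, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = row["timestamp"][:10]  # assumes ISO 8601, e.g. 2025-01-31T09:15:00Z
        totals[(row["agent"], day)] += Decimal(row["cost"])
    return dict(totals)


def over_budget(totals: Dict[AgentDay, Decimal], daily_budget: Decimal) -> List[AgentDay]:
    """Return the (agent, day) pairs whose spend exceeds the budget, sorted."""
    return sorted(key for key, spent in totals.items() if spent > daily_budget)


# Made-up rows standing in for an exported file.
sample_export = (
    "agent,timestamp,cost\n"
    "support-bot,2025-01-31T09:15:00Z,0.40\n"
    "support-bot,2025-01-31T10:00:00Z,0.75\n"
    "sales-bot,2025-01-31T11:00:00Z,0.10\n"
)
sample_totals = daily_cost_by_agent(sample_export)
sample_alerts = over_budget(sample_totals, Decimal("1.00"))
```

In this made-up sample, support-bot spends 1.15 USD on 2025-01-31 and is the only entry flagged against a 1.00 USD budget. The same grouping key could be swapped for a user id or workflow name if the export carries those columns.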

Where to look at the data

Trends, charts, and per-agent breakdowns live in Analytics. Per-request inspection — full prompt, full reply, every tool call, every step — lives in Tracing. Both views read the same captured fields.