Observability
Every agent interaction is automatically captured by SentientOne. Full visibility into requests, responses, token usage, latency, and cost — across every conversation, every agent, every day. No instrumentation libraries, no tracing sidecars, no cost-calculation middleware.
What you get for free
Captured automatically · zero code from you
Every request, observed. Example dashboard (range: 24h, agent: All): 4,210 requests · 1.24 M tokens · 1.42 s average latency · $18.40 cost.
| Captured field | What it tells you |
|---|---|
| request_messages | Full conversation history sent to the LLM (up to 50 messages) |
| response_content | Complete LLM response text |
| prompt_tokens | Tokens consumed by the input / prompt |
| completion_tokens | Tokens consumed by the generated response |
| total_tokens | Sum across all LLM rounds including tool-use iterations |
| cost | USD cost computed from cost_per_1m_tokens |
| provider / model | LLM provider and model name used for this request |
| status | success or error, plus message if applicable |
| latency_ms | End-to-end including MCP tool calls |
| trace_id | X-Trace-Id header — pivot into Tracing |
Zero instrumentation · Per-request + per-agent · Rolled up daily
You don't add a single line of logging to your application. Every call to /v1/chat is recorded the moment it lands, and the data flows straight into Analytics for trends and Tracing for per-request inspection.
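A minimal sketch of what this looks like from the caller's side, assuming a hypothetical host, bearer-token auth, a POST method, and a placeholder JSON body. Only the `/v1/chat` path and the `X-Trace-Id` response header come from this page; everything else is illustrative.

```python
import json
import urllib.request
from typing import Mapping, Optional

# Assumption: host, auth scheme, HTTP method, and body shape are placeholders.
# Only the /v1/chat path and the X-Trace-Id header are taken from this page.
BASE_URL = "https://sentientone.example.com"  # hypothetical host


def build_chat_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request to /v1/chat."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
        },
        method="POST",  # assumed method
    )


def extract_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Trace-Id value, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-trace-id":
            return value
    return None
```

Sending the request with `urllib.request.urlopen(req)` and passing `resp.headers` to `extract_trace_id` would yield the id to paste into a Tracing filter. Note that the application code contains no logging or tracing calls of its own.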
What is captured per request
- request_messages: Full conversation history sent to the LLM (up to the last 50 messages). Includes the injected system prompt and any tool-use rounds.
- response_content: Complete LLM response text exactly as returned to your caller, which is useful when debugging unexpected agent behaviour or content-policy blocks.
- prompt_tokens: Tokens consumed by the input / prompt for this request, measured with the provider's native tokeniser.
- completion_tokens: Tokens consumed by the generated response. Combined with prompt_tokens, it gives you the full picture of billable tokens.
- total_tokens: Sum across every LLM round including MCP tool-use iterations. This is the number that drives cost attribution.
- cost: Calculated LLM cost in USD using the agent's Token Cost rates. Rolls up cleanly to per-agent, per-day, per-organisation totals.
- provider / model: Which provider and model handled this request, for example Anthropic claude-sonnet-4 or OpenAI gpt-4o.
- status: success or error, with the error message if applicable. Lights up the status pill in Tracing.
- latency_ms: End-to-end latency including all MCP tool calls. The latency breakdown bar decomposes it by step.
- trace_id: Unique id surfaced as X-Trace-Id on every response. Paste it into a Tracing filter to find this exact request in seconds.
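The relationship between total_tokens and cost can be sketched as below. This is an illustration, not the platform's implementation: it assumes a single blended cost_per_1m_tokens rate, whereas an agent's Token Cost rates may well differ for input and output tokens. The `Round` type and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Round:
    """One LLM round within a request (hypothetical type for illustration)."""
    prompt_tokens: int
    completion_tokens: int


def total_tokens(rounds: Iterable[Round]) -> int:
    """Sum prompt and completion tokens across every round, tool-use included."""
    return sum(r.prompt_tokens + r.completion_tokens for r in rounds)


def cost_usd(tokens: int, cost_per_1m_tokens: float) -> float:
    """Convert a token count to USD, assuming one blended per-million rate."""
    return tokens / 1_000_000 * cost_per_1m_tokens


# A request that needed two rounds: the initial call plus one tool-use iteration.
rounds = [Round(1200, 150), Round(1500, 300)]
tokens = total_tokens(rounds)  # 3150
print(tokens, cost_usd(tokens, cost_per_1m_tokens=3.0))
```

Because tool-use iterations resend the conversation, a request with several rounds can cost noticeably more than its final prompt and response alone would suggest.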
Why this matters at scale
1. One place, all agents
   A typical organisation running 5–20 agents across multiple teams would need separate logging, monitoring, and cost-tracking implementations per app. SentientOne centralises all of this automatically.
2. Audit and compliance
   Every LLM interaction is logged with timestamps, user ids, and conversation ids, ready for compliance reviews, security audits, or dispute resolution.
3. Cost control
   Know exactly which agents, users, or workflows are driving AI spend before your LLM bill arrives. Set up alerts or impose limits at the agent level.
4. Prompt-engineering feedback loop
   Compare token counts and response quality across system prompt iterations to find the most efficient and accurate configuration for each agent.
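As a rough illustration of the cost-control idea, the sketch below rolls captured records up to per-agent, per-day totals and flags the pairs that exceed a daily budget. The record keys `agent_id` and `timestamp` are assumptions (this page names `cost` but not those fields), and the budget check is a local stand-in, not the platform's alerting feature.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Tuple

AgentDay = Tuple[str, str]  # (agent id, ISO date such as "2025-01-31")


def rollup_cost(records: Iterable[Mapping]) -> Dict[AgentDay, float]:
    """Sum the cost field per agent per day.

    Assumes each record has hypothetical keys agent_id and timestamp
    (ISO 8601 string) alongside the documented cost field.
    """
    totals: Dict[AgentDay, float] = defaultdict(float)
    for rec in records:
        day = rec["timestamp"][:10]  # date prefix of an ISO 8601 timestamp
        totals[(rec["agent_id"], day)] += rec["cost"]
    return dict(totals)


def over_budget(totals: Mapping[AgentDay, float], daily_limit_usd: float) -> List[AgentDay]:
    """Return the (agent, day) pairs whose spend exceeds the daily limit."""
    return sorted(key for key, spend in totals.items() if spend > daily_limit_usd)


records = [
    {"agent_id": "support-bot", "timestamp": "2025-01-31T09:15:00Z", "cost": 0.50},
    {"agent_id": "support-bot", "timestamp": "2025-01-31T17:40:00Z", "cost": 0.75},
    {"agent_id": "sales-bot", "timestamp": "2025-01-31T11:05:00Z", "cost": 0.25},
]
print(over_budget(rollup_cost(records), daily_limit_usd=1.00))
```

The same grouping key could be swapped for a user id or a system prompt version to compare spend across users or prompt iterations.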