Tutorial · 7 min read

Agent token usage analytics: build a dashboard your finance team will love

Build a dashboard that answers how much each user costs you per month, broken down by model, feature, and cache hit rate. SQL schema plus the five charts that matter.

"How much did this user cost us last month?" used to be answerable by SaaS pricing alone. With agents in production, the answer is "depends on which tools they used, which model picked them up, and whether the prompt cache was warm." Here is how to build the dashboard that surfaces it.

The minimum data you need

Every Anthropic API response carries a usage object. Capture these five fields per call:

  • input_tokens — uncached input.
  • cache_read_input_tokens — input served from the prompt cache (billed at 10% of the base input rate).
  • cache_creation_input_tokens — input written to the cache (billed at 1.25x the base input rate).
  • output_tokens — model output.
  • model — which model the call hit.

Tag each with a session_id, user_id, and a tool_count if MCP was involved. That is enough to slice every interesting question.
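
The capture step can be sketched as a small flattening function. This is a minimal sketch assuming a response shaped like the API's JSON (a dict with `model` and a `usage` object); `usage_row`, and the way tags are passed in, are illustrative names, not a library API.

```python
# Flatten one API response's usage block into a row dict ready for insert.
# Field names under "usage" match the five fields listed above; user_id,
# session_id, feature, and tool_count come from your own request context.
def usage_row(response, user_id, session_id, feature=None, tool_count=0):
    usage = response.get("usage", {})
    return {
        "user_id": user_id,
        "session_id": session_id,
        "model": response.get("model", "unknown"),
        "input_tokens": usage.get("input_tokens", 0),
        "cached_read_tokens": usage.get("cache_read_input_tokens", 0) or 0,
        "cached_write_tokens": usage.get("cache_creation_input_tokens", 0) or 0,
        "output_tokens": usage.get("output_tokens", 0),
        "tool_count": tool_count,
        "feature": feature,
    }
```

The `or 0` guards normalize `None` cache fields to zero so the NOT NULL columns below never reject a row.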

The schema

CREATE TABLE agent_usage (
  id bigserial primary key,
  ts timestamptz not null default now(),
  user_id text not null,
  session_id text not null,
  model text not null,
  input_tokens int not null,
  cached_read_tokens int not null default 0,
  cached_write_tokens int not null default 0,
  output_tokens int not null,
  tool_count int not null default 0,
  feature text
);
CREATE INDEX ON agent_usage (user_id, ts);
CREATE INDEX ON agent_usage (model, ts);
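
The write path is one parameterized insert per call. A sketch, using an in-memory SQLite stand-in for the Postgres table above (`bigserial`/`timestamptz` swapped for SQLite equivalents); the column names are the real ones.

```python
import sqlite3

# SQLite stand-in for the Postgres schema above, for local testing.
SCHEMA = """
CREATE TABLE agent_usage (
  id integer primary key autoincrement,
  ts text not null default current_timestamp,
  user_id text not null,
  session_id text not null,
  model text not null,
  input_tokens int not null,
  cached_read_tokens int not null default 0,
  cached_write_tokens int not null default 0,
  output_tokens int not null,
  tool_count int not null default 0,
  feature text
);
"""

def record_usage(conn, row):
    # row is the dict produced at capture time; named placeholders keep
    # the column/field mapping explicit.
    conn.execute(
        """INSERT INTO agent_usage
           (user_id, session_id, model, input_tokens, cached_read_tokens,
            cached_write_tokens, output_tokens, tool_count, feature)
           VALUES (:user_id, :session_id, :model, :input_tokens,
                   :cached_read_tokens, :cached_write_tokens,
                   :output_tokens, :tool_count, :feature)""",
        row,
    )
```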

The cost calculator

Do not store dollar amounts — store tokens, compute dollars in the dashboard. Pricing changes; tokens do not. A small SQL view:

CREATE VIEW agent_cost AS
SELECT
  ts, user_id, session_id, model,
  CASE model
    WHEN 'opus-4-7' THEN
      input_tokens * 0.000015
      + cached_read_tokens * 0.0000015
      + cached_write_tokens * 0.00001875
      + output_tokens * 0.000075
    WHEN 'sonnet-4-6' THEN
      input_tokens * 0.000003
      + cached_read_tokens * 0.0000003
      + cached_write_tokens * 0.00000375
      + output_tokens * 0.000015
    WHEN 'haiku-4-5' THEN
      input_tokens * 0.0000008
      + cached_read_tokens * 0.00000008
      + cached_write_tokens * 0.000001
      + output_tokens * 0.000004
    ELSE NULL  -- unknown model: surfaces as a NULL cost instead of a silent zero
  END AS cost_usd
FROM agent_usage;
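
The same rate table is handy as a plain function for spot-checking the view or pricing a single call. A sketch using the rates above (USD per token, against the hypothetical model names used in this post); unknown models raise a KeyError rather than returning NULL.

```python
# (input, cache_read, cache_write, output) rates in USD per token,
# mirroring the CASE arms of the agent_cost view.
RATES = {
    "opus-4-7":   (0.000015,  0.0000015,  0.00001875, 0.000075),
    "sonnet-4-6": (0.000003,  0.0000003,  0.00000375, 0.000015),
    "haiku-4-5":  (0.0000008, 0.00000008, 0.000001,   0.000004),
}

def call_cost(model, input_tokens, cached_read, cached_write, output_tokens):
    inp, cr, cw, out = RATES[model]  # KeyError on an unpriced model
    return (input_tokens * inp + cached_read * cr
            + cached_write * cw + output_tokens * out)
```

For example, a Sonnet call with 1,000 uncached input tokens and 500 output tokens comes to $0.0105.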

The five charts that matter

1. Daily spend by model

Stacked bar. Spot the day the new feature shipped and Opus usage spiked.

2. Cost per active user

The metric your CFO actually cares about. SUM(cost_usd) / COUNT(DISTINCT user_id) over the last 30 days, broken down by user tier if you have one.
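
The aggregation itself is trivial; what matters is that total spend is divided by distinct users, not rows. A minimal sketch over (user_id, cost_usd) pairs pulled from the view:

```python
def cost_per_active_user(rows):
    """rows: iterable of (user_id, cost_usd) pairs for the window."""
    total = 0.0
    users = set()
    for user_id, cost in rows:
        total += cost
        users.add(user_id)  # distinct users, matching COUNT(DISTINCT user_id)
    return total / len(users) if users else 0.0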

3. Cache hit rate

cached_read_tokens / (input_tokens + cached_read_tokens). Anything below 50% is leaving money on the table for repeat-query agents.
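
The same ratio as a guarded function, so a window with zero input does not divide by zero:

```python
def cache_hit_rate(input_tokens, cached_read_tokens):
    denom = input_tokens + cached_read_tokens
    return cached_read_tokens / denom if denom else 0.0
```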

4. Top 10 most expensive sessions

Click into any of them, see the trace. Pairs naturally with a trace viewer.

5. Cost per feature / endpoint

Tag your calls with the product feature that triggered them. Now you can answer "is feature X profitable?" without guesswork.

Alerting

Two alerts pay for themselves immediately:

  1. Per-user daily spend > X dollars — catches runaway agent loops before they cost a paycheck.
  2. Daily total spend > 1.5x trailing 7-day average — catches a bad deploy or a pricing error.
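
Alert 2 reduces to one comparison. A sketch, assuming `history` holds the last seven daily totals in USD:

```python
def spend_alert(today_usd, history, multiplier=1.5):
    """True when today's spend exceeds multiplier x the trailing average."""
    if not history:
        return False  # no baseline yet, nothing to compare against
    baseline = sum(history) / len(history)
    return today_usd > multiplier * baseline
```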

Putting it on a dashboard

Three pragmatic stacks:

  • Metabase — point at Postgres, ship in an afternoon.
  • Grafana + Postgres — better if you already run Grafana for infra.
  • Hex / Mode — best if your finance team is already there.

Common pitfalls

  • Counting cached writes as plain input — they bill at 1.25x the base input rate, not 1x. Easy to get wrong, and it quietly distorts every cost view.
  • Ignoring failed calls — they still bill input tokens. Capture them.
  • Tagging only at one boundary — if a single user request fans out to 5 sub-agents, propagate the tags.
  • No feature tag — without it, you cannot answer the questions that actually drive product decisions.
