Commit Graph

12529 Commits

Author SHA1 Message Date
Kit Langton
c20d070b9a docs(llm): fix stale references in protocols/shared.ts and gemini.ts
- subtractTokens JSDoc said the raw payload lives on Usage.native, but
  that field was renamed to providerMetadata earlier in this PR.
- totalTokens JSDoc still described the abandoned "additive" first-pass
  contract where inputTokens/outputTokens were non-cached / visible only.
  We landed on inclusive totals; the fallback already covers cache and
  reasoning.
- Removed a duplicate inline comment in Gemini's mapUsage — the
  function-level comment already explains the visible/reasoning sum and
  the undefined-when-incomplete rule.
2026-05-10 22:08:12 -04:00
Kit Langton
d048bd6f4b test(llm): re-record golden scenarios against live providers
Verifies the new Usage mapper code against live provider responses for
OpenAI Chat, OpenAI Responses, Anthropic, Gemini, DeepSeek, and
TogetherAI — 16 fresh recordings, all assertions pass. No existing
cassettes were modified; these populate test slots that were previously
skipped in replay mode.

Recorded via:
    set -a; source .env.recorded.local; set +a
    RECORD=true bun test test/provider/*.recorded.test.ts

Redactor stripped all auth headers; no secrets in the cassettes.
2026-05-10 22:03:02 -04:00
Kit Langton
ab9b79ef88 refactor(llm): rename Usage.native to providerMetadata
Aligns the escape-hatch field name with `LLMEvent.providerMetadata` used
elsewhere in this package (and with AI SDK / pydantic-ai / LangChain
conventions for the same idea). Two parallel escape hatches having
different names was a wart.

The raw payload is now wrapped under the provider key — `{ openai: ... }`,
`{ anthropic: ... }`, `{ google: ... }`, `{ bedrock: ... }` — using the
existing `ProviderMetadata = Record<string, Record<string, unknown>>`
schema rather than a flat record. Same shape as
`LLMEvent.providerMetadata`, so consumers downstream can read both with
the same code.

Anthropic's `mergeUsage` merges the per-provider sub-record across
`message_start` and `message_delta` instead of spreading at the top level.
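
A minimal sketch of the per-provider merge described above, assuming the `ProviderMetadata` shape quoted in this message. The helper name `mergeProviderMetadata` is hypothetical and is not taken from the package.

```typescript
type ProviderMetadata = Record<string, Record<string, unknown>>

// Hypothetical helper: merge sub-records per provider key, so fields seen in
// an earlier event (e.g. message_start) survive a later one (message_delta).
function mergeProviderMetadata(
  earlier: ProviderMetadata | undefined,
  later: ProviderMetadata | undefined,
): ProviderMetadata | undefined {
  if (earlier === undefined) return later
  if (later === undefined) return earlier
  const merged: ProviderMetadata = { ...earlier }
  for (const [provider, fields] of Object.entries(later)) {
    // Spread inside the provider key rather than replacing the whole sub-record.
    merged[provider] = { ...merged[provider], ...fields }
  }
  return merged
}
```

A top-level spread (`{ ...earlier, ...later }`) would replace the whole `anthropic` sub-record with the later one, which is the behavior this commit moves away from.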
2026-05-10 21:42:09 -04:00
Kit Langton
d4ff331052 refactor(llm): inclusive total + non-overlapping breakdown for Usage
Final shape after considering ecosystem conventions:

  inputTokens             — inclusive total (matches AI SDK / OpenAI / LangChain)
  outputTokens            — inclusive total (includes reasoning)
  nonCachedInputTokens    — breakdown: fresh prompt
  cacheReadInputTokens    — breakdown: cache hit
  cacheWriteInputTokens   — breakdown: cache write
  reasoningTokens         — subset of outputTokens

Invariant:
  nonCachedInputTokens + cacheReadInputTokens + cacheWriteInputTokens = inputTokens
  reasoningTokens <= outputTokens
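
The invariant can be checked mechanically. This is a sketch only: `UsageLike` and `checkUsageInvariant` are hypothetical names for illustration, and the package's real `Usage` is a class rather than this plain interface.

```typescript
// Hypothetical plain shape; field names follow the list in this commit message.
interface UsageLike {
  inputTokens?: number
  outputTokens?: number
  nonCachedInputTokens?: number
  cacheReadInputTokens?: number
  cacheWriteInputTokens?: number
  reasoningTokens?: number
}

// Returns a description of each violated invariant. A check is skipped when
// any field it needs is undefined, since a partial breakdown proves nothing.
function checkUsageInvariant(u: UsageLike): string[] {
  const problems: string[] = []
  if (
    u.inputTokens !== undefined &&
    u.nonCachedInputTokens !== undefined &&
    u.cacheReadInputTokens !== undefined &&
    u.cacheWriteInputTokens !== undefined &&
    u.nonCachedInputTokens + u.cacheReadInputTokens + u.cacheWriteInputTokens !== u.inputTokens
  ) {
    problems.push("input breakdown does not sum to inputTokens")
  }
  if (
    u.outputTokens !== undefined &&
    u.reasoningTokens !== undefined &&
    u.reasoningTokens > u.outputTokens
  ) {
    problems.push("reasoningTokens exceeds outputTokens")
  }
  return problems
}
```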

Why this shape:

- `inputTokens` keeps its AI-SDK / OpenAI semantics, so a reader from any
  major ecosystem sees the number they expect.
- The non-overlapping breakdown fields are populated alongside the
  inclusive totals — consumers read whichever they need without
  subtracting. This eliminates the underflow bug class (opencode#26620)
  structurally without diverging on naming.
- Aligns with the AI SDK v3 spec proposal (vercel/ai#9921), which adds
  exactly this kind of non-overlapping breakdown to address the active
  ecosystem bugs around cache token double-counting and underflow
  (pydantic-ai#4364, langfuse#12306/#11979, vercel/ai#8349,
  langchain#32818, langchainjs#10249).

Mappers:

- OpenAI Chat / Responses / Bedrock: provider reports inclusive totals
  natively; mapper derives `nonCachedInputTokens` via
  `ProviderShared.subtractTokens`.
- Gemini: `promptTokenCount` is inclusive; `candidatesTokenCount` is
  *exclusive* of `thoughtsTokenCount`, so mapper sums those to produce
  the inclusive `outputTokens`. Only computes the total when the visible
  component is reported (avoids fabricating an inclusive number from a
  partial breakdown).
- Anthropic: `input_tokens` is *non-cached* natively; mapper sums it with
  cache reads/writes to produce the inclusive `inputTokens`.
  `output_tokens` is inclusive (Anthropic doesn't break thinking out, so
  `reasoningTokens` stays undefined).
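
The Anthropic and Gemini rules above can be sketched as follows. The function names, the `UsageLike` shape, and the `sumDefined` helper are assumptions made for illustration. Only the provider payload field names come from the providers' usage payloads, and the real mappers may differ in detail.

```typescript
// Hypothetical normalized shape; field names follow this commit message.
interface UsageLike {
  inputTokens?: number
  outputTokens?: number
  nonCachedInputTokens?: number
  cacheReadInputTokens?: number
  cacheWriteInputTokens?: number
  reasoningTokens?: number
}

// Sum the defined parts; undefined when nothing was reported.
function sumDefined(...parts: Array<number | undefined>): number | undefined {
  const defined = parts.filter((p): p is number => p !== undefined)
  return defined.length === 0 ? undefined : defined.reduce((acc, n) => acc + n, 0)
}

// Anthropic: input_tokens is non-cached, so the inclusive total is a sum.
function mapAnthropicUsage(raw: {
  input_tokens?: number
  output_tokens?: number
  cache_read_input_tokens?: number
  cache_creation_input_tokens?: number
}): UsageLike {
  return {
    inputTokens: sumDefined(raw.input_tokens, raw.cache_read_input_tokens, raw.cache_creation_input_tokens),
    outputTokens: raw.output_tokens, // already inclusive of thinking
    nonCachedInputTokens: raw.input_tokens,
    cacheReadInputTokens: raw.cache_read_input_tokens,
    cacheWriteInputTokens: raw.cache_creation_input_tokens,
    // reasoningTokens stays undefined: thinking is not broken out.
  }
}

// Gemini: candidatesTokenCount excludes thoughts, so outputTokens is a sum,
// computed only when the visible component was actually reported.
function mapGeminiUsage(raw: {
  promptTokenCount?: number
  candidatesTokenCount?: number
  thoughtsTokenCount?: number
}): UsageLike {
  return {
    inputTokens: raw.promptTokenCount,
    outputTokens:
      raw.candidatesTokenCount === undefined
        ? undefined
        : raw.candidatesTokenCount + (raw.thoughtsTokenCount ?? 0),
    reasoningTokens: raw.thoughtsTokenCount,
  }
}
```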

Added a `visibleOutputTokens` getter (clamped `outputTokens - reasoningTokens`)
as the one safe escape hatch for consumers wanting the non-reasoning view.

Added `ProviderShared.sumTokens` to derive an inclusive total from a
non-overlapping breakdown, returning `undefined` when every input is
undefined (so we don't fabricate a 0).
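
A sketch of the two additions described above. The signature of `sumTokens` and the `UsageSketch` class are assumptions made for illustration. In particular, returning `undefined` from the getter when `outputTokens` is missing is a guess, not confirmed by this message.

```typescript
// Assumed signature: sum whatever parts are defined, but return undefined
// (not 0) when every part is undefined.
function sumTokens(...parts: Array<number | undefined>): number | undefined {
  const defined = parts.filter((p): p is number => p !== undefined)
  if (defined.length === 0) return undefined
  return defined.reduce((acc, n) => acc + n, 0)
}

// Hypothetical stand-in for the real Usage class, showing only the getter.
class UsageSketch {
  outputTokens: number | undefined
  reasoningTokens: number | undefined

  constructor(fields: { outputTokens?: number; reasoningTokens?: number }) {
    this.outputTokens = fields.outputTokens
    this.reasoningTokens = fields.reasoningTokens
  }

  // Clamped difference: never negative, even if a provider reports more
  // reasoning tokens than output tokens.
  get visibleOutputTokens(): number | undefined {
    if (this.outputTokens === undefined) return undefined
    return Math.max(0, this.outputTokens - (this.reasoningTokens ?? 0))
  }
}
```

Keeping the only subtraction inside one clamped getter means callers never write `outputTokens - reasoningTokens` themselves.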
2026-05-10 20:39:22 -04:00
Kit Langton
f5d199db62 feat(llm): add Usage.totalInputTokens / totalOutputTokens getters
Match the `LLMResponse.text` / `reasoning` / `toolCalls` getter pattern
in the same file — `usage.totalInputTokens` reads naturally and lives
where the Usage data does. Both sums are monotonic under the additive
contract, so callers no longer need to remember which fields are
non-overlapping.

Test fixtures that previously asserted with `usage: { ... }` plain
literals are now wrapped with `new Usage({...})` to match the runtime
shape the mappers actually produce (an instance, not a struct).
2026-05-10 19:29:41 -04:00
Kit Langton
0d4f8d126f refactor(llm): drop Usage.totalInput / totalOutput helpers
The additive contract delivers value at the mapper boundary — every
field is non-overlapping and non-negative, so any caller summing
arbitrary subsets is correct by construction. Two-line helpers that
just sum three or two known fields add API surface without paying for
themselves, and there are no in-tree consumers today. If v2 wants them
at integration time, the right place is a getter on the `Schema.Class`
(matching the `LLMResponse.text` / `reasoning` / `toolCalls` pattern in
the same file), not a static namespace helper.
2026-05-10 19:15:46 -04:00
Kit Langton
478f3ae50c refactor(llm): trim Usage helpers + Bedrock subtraction
Review pass:
- Drop `Pick<>` type aliases on `Usage.totalInput` / `Usage.totalOutput`
  — the helpers can take `Usage` directly since every field is optional.
- Collapse Bedrock's nested `subtractTokens(subtractTokens(...))` into a
  single subtraction against the summed cache subtotals.
- Drop arithmetic-walkthrough comments in test fixtures (the raw
  fixture values are right next to the expected outputs).
- Generalize the comment on `mapUsage` in `openai-chat.ts` so the
  rationale outlives the PR reference.
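
To illustrate why the Bedrock collapse is behavior-preserving: with a clamped subtraction and non-negative cache counts, subtracting twice and subtracting the sum once give the same result. The `subtractTokens` signature below is an assumption, and `nestedNonCached` / `collapsedNonCached` are hypothetical names.

```typescript
// Assumed signature for the clamped subtraction helper.
function subtractTokens(total: number | undefined, part: number | undefined): number | undefined {
  if (total === undefined) return undefined
  return Math.max(0, total - (part ?? 0))
}

// Before: two nested clamped subtractions.
function nestedNonCached(input?: number, cacheRead?: number, cacheWrite?: number): number | undefined {
  return subtractTokens(subtractTokens(input, cacheRead), cacheWrite)
}

// After: one subtraction against the summed cache subtotals.
function collapsedNonCached(input?: number, cacheRead?: number, cacheWrite?: number): number | undefined {
  return subtractTokens(input, (cacheRead ?? 0) + (cacheWrite ?? 0))
}
```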
2026-05-10 13:22:49 -04:00
Kit Langton
b9451175a6 refactor(llm): make LLM.Usage a fully-additive contract
Defines a single invariant for `LLM.Usage`: every field is non-negative
and every meaningful aggregate is a *sum*, never a difference. Total
billable input = inputTokens + cacheReadInputTokens + cacheWriteInputTokens.
Total billable output = outputTokens + reasoningTokens. Adding two
non-negatives cannot underflow, so consumers can no longer reproduce the
underflow-then-clamp bug class fixed by #26620.

Each protocol mapper now enforces the contract at the provider boundary
via `ProviderShared.subtractTokens`, which clamps with `Math.max(0, …)`
for defense against provider bugs:

- OpenAI Chat / Responses: pull `cached_tokens` out of `prompt_tokens` /
  `input_tokens`; pull `reasoning_tokens` out of `completion_tokens` /
  `output_tokens`. The provider's `total_tokens` is preserved verbatim.
- Gemini: pull `cachedContentTokenCount` out of `promptTokenCount`.
  Gemini already splits visible candidates from thoughts.
- Bedrock: pull `cacheReadInputTokens` and `cacheWriteInputTokens` out of
  `inputTokens`, matching AWS prompt-caching docs.
- Anthropic: already non-overlapping per the Messages API; pass through.

Adds `Usage.totalInput` / `Usage.totalOutput` helpers for callers that
want the merged view, and a regression test covering the clamp behavior.
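
As a sketch of the additive contract for OpenAI Chat as this commit describes it: the `subtractTokens` signature, the `AdditiveUsage` shape, and the mapper name are assumptions for illustration. The raw field names are those of the Chat Completions usage payload.

```typescript
// Assumed signature: clamp so a provider bug cannot yield a negative count.
function subtractTokens(total: number | undefined, part: number | undefined): number | undefined {
  if (total === undefined) return undefined
  return Math.max(0, total - (part ?? 0))
}

// Subset of the OpenAI Chat Completions usage payload.
interface OpenAIChatUsage {
  prompt_tokens?: number
  completion_tokens?: number
  total_tokens?: number
  prompt_tokens_details?: { cached_tokens?: number }
  completion_tokens_details?: { reasoning_tokens?: number }
}

// Hypothetical additive shape: every field is non-overlapping.
interface AdditiveUsage {
  inputTokens?: number
  cacheReadInputTokens?: number
  outputTokens?: number
  reasoningTokens?: number
  totalTokens?: number
}

function mapOpenAIChatUsageAdditive(raw: OpenAIChatUsage): AdditiveUsage {
  const cached = raw.prompt_tokens_details?.cached_tokens
  const reasoning = raw.completion_tokens_details?.reasoning_tokens
  return {
    inputTokens: subtractTokens(raw.prompt_tokens, cached), // non-cached only
    cacheReadInputTokens: cached,
    outputTokens: subtractTokens(raw.completion_tokens, reasoning), // visible only
    reasoningTokens: reasoning,
    totalTokens: raw.total_tokens, // preserved verbatim
  }
}
```

Under this contract a consumer recovers billable totals by adding fields, for example `inputTokens + cacheReadInputTokens`, and never by subtracting.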

The reasoning underflow fixed in #26620 was the most visible symptom of
a broader semantic inconsistency in this package: providers also disagreed
on whether `inputTokens` includes cache reads (Anthropic excluded;
OpenAI/Gemini/Bedrock included), which would silently double-subtract
the moment v2 wired LLM.Usage into Session.getUsage. Normalizing now,
pre-integration, closes both holes in one move.
2026-05-10 13:07:58 -04:00
Kit Langton
9c8da69196 Use Effect timeout in compaction test (#26728) 2026-05-10 12:45:54 -04:00
opencode-agent[bot]
a78018697c chore: generate 2026-05-10 16:44:40 +00:00
Kit Langton
e45b6ef1de refactor(http-recorder): use Schema.TaggedErrorClass for cassette errors (#26729) 2026-05-10 16:43:33 +00:00
opencode-agent[bot]
b616543ac2 chore: generate 2026-05-10 16:30:55 +00:00
Kit Langton
2bd3d9a696 refactor(http-recorder): hide cassette format behind Cassette seam (#26725) 2026-05-10 12:29:55 -04:00
Kit Langton
fa15dbc5ec Migrate compaction process tests (#26723) 2026-05-10 12:25:44 -04:00
opencode-agent[bot]
312e5c7a7c chore: generate 2026-05-10 16:22:29 +00:00
Kit Langton
049502fac6 fix(server): return diagnosable body for schema rejections (#26631) 2026-05-10 16:21:32 +00:00
opencode-agent[bot]
cc2915be16 chore: generate 2026-05-10 16:20:16 +00:00
Kit Langton
ce061bf661 Add explicit LLM stream lifecycle events (#26722) 2026-05-10 12:19:13 -04:00
Frank
3b8790e034 zen: fix usage css on mobile 2026-05-10 12:14:11 -04:00
Kit Langton
a4f3cedcdf Start effect-style compaction tests 2026-05-10 16:12:00 +00:00
opencode-agent[bot]
1c9a2eb239 chore: generate 2026-05-10 16:06:18 +00:00
Kit Langton
4fb417d3b5 feat(http-recorder): default mode to "auto" (#26719) 2026-05-10 16:05:11 +00:00
Kit Langton
11030c627b Scope boolean query overrides 2026-05-10 11:57:52 -04:00
opencode-agent[bot]
c104098a66 chore: generate 2026-05-10 15:55:49 +00:00
Kit Langton
49ee3ba85a Source diff message query pattern (#26638) 2026-05-10 11:54:54 -04:00
opencode-agent[bot]
4fc538378d chore: generate 2026-05-10 14:50:21 +00:00
Kit Langton
d28b5ad2f4 refactor(http-recorder): Redactor + Recorder seams, README (#26636) 2026-05-10 10:49:22 -04:00
opencode-agent[bot]
6589a66822 chore: generate 2026-05-10 12:28:11 +00:00
Shoubhit Dash
5cf9abe743 feat(scout): materialize configured reference repos (#26692) 2026-05-10 17:57:11 +05:30
Frank
903d81819d Zen: add Ring 2.6 1T 2026-05-10 03:51:34 -04:00
opencode-agent[bot]
472f9e64a6 chore: update nix node_modules hashes 2026-05-10 07:06:30 +00:00
Frank
c04fa9e253 sync: revert
This reverts commit 3a7f617098.
2026-05-10 02:58:46 -04:00
opencode-agent[bot]
3a78fb1f42 chore: generate 2026-05-10 06:49:21 +00:00
Aiden Cline
85ce6a5f95 feat: better image handling (auto resize & max size constraints) (#26401) 2026-05-10 01:48:19 -05:00
opencode-agent[bot]
5217e6c1af chore: generate 2026-05-10 06:39:09 +00:00
Frank
3a7f617098 go: add tencent icon 2026-05-10 02:37:50 -04:00
opencode-agent[bot]
d9150413cb chore: generate 2026-05-10 06:24:35 +00:00
Jack
bcbc1dba22 Go add hy3 preview (#26533) 2026-05-10 02:23:34 -04:00
Frank
ce3235e115 sync 2026-05-10 02:17:32 -04:00
opencode-agent[bot]
a9a2a597d5 chore: generate 2026-05-10 04:30:04 +00:00
Dax
3753601f87 Format TUI paths relative to session directory (#26648) 2026-05-10 04:29:02 +00:00
Kit Langton
fb4bab8a66 Remove redundant ID Zod overrides (#26633) 2026-05-09 23:12:21 -04:00
opencode-agent[bot]
b3526f6ce9 chore: generate 2026-05-10 03:03:37 +00:00
Kit Langton
f220f02a2f Source workspace path pattern (#26632) 2026-05-09 23:02:31 -04:00
opencode-agent[bot]
235a86fb60 chore: generate 2026-05-10 02:59:46 +00:00
Kit Langton
67b9c9c027 Source HTTP API ID path patterns (#26623) 2026-05-09 22:58:47 -04:00
opencode
2f11c9f7ed sync release versions for v1.14.46 2026-05-10 02:34:36 +00:00
opencode-agent[bot]
e1c1193f3e chore: generate 2026-05-10 02:11:45 +00:00
Kit Langton
29250a0efb fix(session): loosen remaining stored numeric schemas to tolerate legacy data (#26622) 2026-05-09 22:10:48 -04:00
Kit Langton
c6e6bdf59f fix(session): tolerate negative token counts in stored parts (#26620) 2026-05-09 22:10:44 -04:00