BrowserOS/apps/server
shivammittal274 3808faf94d fix: robust compaction with Pi-style token counting + overflow middle… (#444)
* fix: robust compaction with Pi-style token counting + overflow middleware

Root cause: getCurrentTokenCount() returned stale inputTokens from the
previous step, ignoring new tool results added to messages since that
step. A large tool output (DOM snapshot, page content) caused a token
jump that bypassed the compaction threshold check, leading to
context_length_exceeded errors (322K tokens sent, model max 262K).

Layer 1 — Accurate token counting (proactive):
- Adopt Pi coding agent's additive approach: base(inputTokens) +
  outputTokens + estimate(trailing tool results)
- Trailing tool results are estimated by walking backwards from end of
  messages array until a non-tool message is found
- Falls back to full estimation with safety multiplier when no real
  usage data is available (first step of a turn)
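
A minimal sketch of the additive count described in Layer 1, not the actual implementation. The `Message` and `Usage` shapes, the 4-characters-per-token heuristic, the 1.2 safety multiplier, and all function names are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real code works on the AI SDK's message types.
type Role = "system" | "user" | "assistant" | "tool";
interface Message { role: Role; content: string; }
interface Usage { inputTokens: number; outputTokens: number; }

const CHARS_PER_TOKEN = 4;     // assumed rough heuristic
const SAFETY_MULTIPLIER = 1.2; // assumed value, used only without real usage data

// Character-based estimate for a list of messages.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Walk backwards from the end until a non-tool message is found.
function trailingToolMessages(messages: Message[]): Message[] {
  let start = messages.length;
  while (start > 0 && messages[start - 1]?.role === "tool") start--;
  return messages.slice(start);
}

function currentTokenCount(messages: Message[], lastUsage?: Usage): number {
  if (!lastUsage) {
    // First step of a turn: no provider usage yet, so estimate everything.
    return Math.ceil(estimateTokens(messages) * SAFETY_MULTIPLIER);
  }
  // Real usage from the previous step plus tool results appended since then.
  return (
    lastUsage.inputTokens +
    lastUsage.outputTokens +
    estimateTokens(trailingToolMessages(messages))
  );
}
```

With this shape, a large tool result appended after the previous step raises the count immediately, so the compaction threshold check sees it before the next model call.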

Layer 2 — Context overflow middleware (reactive):
- LanguageModelV3Middleware that wraps doGenerate/doStream
- Catches context_length_exceeded errors at the model call level
- Truncates the prompt: keeps system messages plus the most recent
  non-system messages, targeting 60% of the context window
- Retries the model call once
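
The catch, truncate, retry-once flow of Layer 2 can be sketched as below. This deliberately avoids the real `LanguageModelV3Middleware` signature: `ModelCall`, `callWithOverflowRetry`, and the three patterns are hypothetical stand-ins, and the patterns are only examples, not the set used in the commit.

```typescript
type Role = "system" | "user" | "assistant" | "tool";
interface Message { role: Role; content: string; }

// Hypothetical stand-in for a wrapped doGenerate/doStream call.
type ModelCall<T> = (prompt: Message[]) => Promise<T>;

// Illustrative patterns only (assumption); the real list is provider-specific.
const OVERFLOW_PATTERNS: RegExp[] = [
  /context_length_exceeded/i,
  /maximum context length/i,
  /prompt is too long/i,
];

function isContextOverflow(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return OVERFLOW_PATTERNS.some((pattern) => pattern.test(message));
}

async function callWithOverflowRetry<T>(
  call: ModelCall<T>,
  prompt: Message[],
  truncate: (prompt: Message[]) => Message[],
): Promise<T> {
  try {
    return await call(prompt);
  } catch (err) {
    // Unrelated errors (rate limits, auth, network) are rethrown untouched.
    if (!isContextOverflow(err)) throw err;
    // Single retry with a truncated prompt; a second failure propagates.
    return call(truncate(prompt));
  }
}
```

Retrying only once keeps the failure mode bounded: if the truncated prompt still overflows, the caller sees the provider error instead of a retry loop.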

Verified end-to-end with real model (Gemini Flash Lite via OpenRouter)
on 16K context window: 4 compactions triggered correctly across 8
steps, no context_length_exceeded errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: adopt Pi-style overflow detection patterns + fix truncation edge case

- Replace 6 generic substring matches with 17 provider-specific regex
  patterns from Pi coding agent (Anthropic, OpenAI, Google, xAI, Groq,
  OpenRouter, Bedrock, Copilot, llama.cpp, LM Studio, MiniMax, Kimi,
  Mistral, z.ai)
- Fix truncatePrompt edge case: when the last message alone exceeded the
  target, keepFrom was never updated, so no non-system messages were
  kept. Now always keeps at least the most recent non-system message.
- Add runtime guard for LanguageModelV3 cast in ai-sdk-agent.ts
- Add tests for false-positive rejection and truncation edge case
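
A sketch of the truncation fix above, assuming a character-based token estimate and a hypothetical `Message` shape. The 60% target comes from the commit message; the helper names and the choice to emit system messages first are simplifications, not the code in the repository.

```typescript
type Role = "system" | "user" | "assistant" | "tool";
interface Message { role: Role; content: string; }

// Assumed heuristic: roughly 4 characters per token.
function estimateOne(message: Message): number {
  return Math.ceil(message.content.length / 4);
}

function truncatePrompt(
  messages: Message[],
  contextWindow: number,
  targetRatio = 0.6,
): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  if (rest.length === 0) return system;

  const systemCost = system.reduce((sum, m) => sum + estimateOne(m), 0);
  const budget = Math.floor(contextWindow * targetRatio) - systemCost;

  // Keep the most recent non-system message unconditionally, even if it
  // alone exceeds the budget. This is the edge case the fix addresses.
  let keepFrom = rest.length - 1;
  let used = estimateOne(rest[keepFrom]!);

  // Extend the kept window backwards while older messages still fit.
  while (keepFrom > 0) {
    const cost = estimateOne(rest[keepFrom - 1]!);
    if (used + cost > budget) break;
    used += cost;
    keepFrom--;
  }
  return [...system, ...rest.slice(keepFrom)];
}
```

Seeding `keepFrom` with the last index, rather than updating it only when a message fits, is what guarantees a non-empty result for oversized trailing messages.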

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 14:22:35 +05:30