The JSON schema (with additionalProperties: false) was missing 8 properties
present in the Zod config schema, causing IDE validation errors for valid
config keys: scheduling_mode, max_cache_first_wait_seconds,
failure_ttl_seconds, toast_scope, request_jitter_max_ms,
soft_quota_threshold_percent, quota_refresh_interval_minutes,
soft_quota_cache_ttl_minutes.
Also adds descriptions for quota_fallback, cli_first, and all new properties
to the build script so they survive future regenerations.
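A check along these lines could catch this kind of drift between the two schemas before it reaches users. This is only a sketch: `findMissingProperties`, `JsonSchemaObject`, and the sample keys are illustrative assumptions, not code from the repository, and the key list stands in for whatever the Zod object schema exposes.

```typescript
// Hypothetical drift check; all names here are illustrative assumptions.
interface JsonSchemaObject {
  additionalProperties?: boolean;
  properties: Record<string, { type?: string; description?: string }>;
}

/** Returns the config keys the JSON schema does not declare, sorted for stable output. */
function findMissingProperties(
  configKeys: readonly string[],
  schema: JsonSchemaObject,
): string[] {
  const declared = new Set(Object.keys(schema.properties));
  return configKeys.filter((key) => !declared.has(key)).sort();
}

// Stand-in for the keys of the Zod config schema.
const zodKeys = ["quiet_mode", "scheduling_mode", "toast_scope"];

// With additionalProperties: false, any key absent from `properties`
// is flagged by IDE validation even though the runtime schema accepts it.
const jsonSchema: JsonSchemaObject = {
  additionalProperties: false,
  properties: { quiet_mode: { type: "boolean" } },
};

const missing = findMissingProperties(zodKeys, jsonSchema);
console.log(missing);
```

Running such a check in the build script or in CI would fail fast whenever a key is added to one schema but not the other.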
Changes:
- Upgraded to Zod v4 and adjusted schema generation for compatibility
- Fixed keep_thinking=true failing without debug mode (signature validation)
- Fixed tool calls failing for tools with no parameters (default args to {})
- Aligned auth headers with official Gemini CLI to reduce account issues
- Fixed quiet_mode not suppressing all toast notifications
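The fix for tools with no parameters amounts to defaulting missing arguments to an empty object. A minimal sketch, assuming a hypothetical `ToolCall` shape and helper name (the plugin's real types may differ):

```typescript
// Hypothetical tool-call shape; the plugin's actual type is an assumption here.
interface ToolCall {
  name: string;
  args?: Record<string, unknown> | null;
}

interface NormalizedToolCall {
  name: string;
  args: Record<string, unknown>;
}

/** Ensures `args` is always an object, so parameterless tools still serialize. */
function normalizeToolCall(call: ToolCall): NormalizedToolCall {
  return { name: call.name, args: call.args ?? {} };
}

console.log(normalizeToolCall({ name: "list_files" }));
```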
Fixes 'Invalid signature in thinking block' error when switching models mid-session.
Root cause: Gemini stores thoughtSignature in metadata.google on tool call parts,
but the existing strippers only checked top-level signatures. When a session containing
a Gemini tool call was switched to Claude, the foreign signature failed Claude's validation.
Changes:
- Add cross-model-sanitizer module for bi-directional sanitization (Gemini<->Claude)
- Integrate sanitizer into request pipeline for Claude models
- Add 42 new tests (28 unit + 14 integration)
- Add E2E test scripts for 5-model verification
Tested with: Gemini, Claude (Anthropic), Claude (Google), OpenAI; all passing.
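The root cause above can be illustrated with a small stripper that removes both the top-level and the nested signature. This is a sketch under assumptions: the `MessagePart` shape and the name `stripGeminiSignatures` are hypothetical, and the real cross-model-sanitizer module may be structured differently.

```typescript
// Hypothetical part shape; only `thoughtSignature` and `metadata.google`
// come from the description above, everything else is assumed.
interface MessagePart {
  type: string;
  thoughtSignature?: string;
  metadata?: {
    google?: { thoughtSignature?: string; [key: string]: unknown };
    [key: string]: unknown;
  };
  [key: string]: unknown;
}

/** Returns copies of the parts with Gemini signatures removed at both levels. */
function stripGeminiSignatures(parts: readonly MessagePart[]): MessagePart[] {
  return parts.map((part) => {
    // Drop the top-level signature (the case older strippers already handled).
    const { thoughtSignature: _topLevel, ...rest } = part;
    const google = rest.metadata?.google;
    if (!google) return rest;
    // Drop the nested signature stored under metadata.google on tool call parts.
    const { thoughtSignature: _nested, ...googleRest } = google;
    return { ...rest, metadata: { ...rest.metadata, google: googleRest } };
  });
}

const original: MessagePart[] = [
  {
    type: "tool-call",
    toolName: "search",
    metadata: { google: { thoughtSignature: "sig-abc", other: "keep-me" } },
  },
];
const cleaned = stripGeminiSignatures(original);
console.log(JSON.stringify(cleaned));
```

The function copies rather than mutates, so the original history stays intact if the session later switches back to Gemini.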
- Deleted RATE_LIMIT_ROUTING_ANALYSIS.md, which is no longer needed.
- Enhanced the regression test to report detailed failure information, including the stderr output of the first failure.
- Updated the plugin to answer 400 errors ("Prompt too long") with a synthetic response instead of an error that locks the session.
- Introduced createSyntheticErrorResponse, which builds a synthetic SSE response carrying the error message so the session remains usable.
- Added tests for createSyntheticErrorResponse covering the behavior and structure of the synthetic SSE events.
- Add 2s deduplication window to prevent rate limit counter inflation from concurrent 429s
- Separate cooldown system from rate limits for non-429 errors (auth failures, 5xx)
- Add quota_fallback config option for automatic quota switching on rate limit
- Add toast notification for 400 'Prompt is too long' errors guiding users to /compact
- Add 5 new cooldown unit tests
- Enhance regression test suite with concurrent test infrastructure
- Add comprehensive rate limit analysis documentation
- Add test-models.ts for validating all supported model endpoints
- Add test-regression.ts for multi-turn regression testing (Issue #50)
- Consolidate Gemini 3 Flash variants (low/medium/high) into single model
- Fix schema structure by flattening nested signature_cache properties
- Extract streaming transformer utilities to dedicated module
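The 2s deduplication window for concurrent 429s listed above can be sketched roughly as follows. The class and method names are hypothetical, and the sketch assumes the window is measured from the last 429 that was actually counted; the plugin's real bookkeeping may differ.

```typescript
// Hypothetical tracker; names and structure are illustrative assumptions.
const DEDUP_WINDOW_MS = 2000;

class RateLimitCounter {
  private count = 0;
  private lastCountedAt: number | null = null;

  /** Records a 429 seen at `nowMs`; returns true only if it incremented the counter. */
  record429(nowMs: number): boolean {
    if (this.lastCountedAt !== null && nowMs - this.lastCountedAt < DEDUP_WINDOW_MS) {
      return false; // Treated as a duplicate of the 429 already counted.
    }
    this.count += 1;
    this.lastCountedAt = nowMs;
    return true;
  }

  get total(): number {
    return this.count;
  }
}

const counter = new RateLimitCounter();
// Three concurrent requests fail within the same window: counted once.
counter.record429(0);
counter.record429(500);
counter.record429(1999);
const afterBurst = counter.total;
// A 429 arriving once the window has elapsed is counted again.
counter.record429(2000);
const afterNext = counter.total;
console.log(afterBurst, afterNext);
```

Passing the timestamp in explicitly, rather than reading the clock inside the method, keeps the behavior easy to unit test.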