Commit Graph

6 Commits

Author SHA1 Message Date
tctinh
22a12154fb feat: add quota fallback option to configuration schema and update Gemini 3 thinking levels 2025-12-28 22:27:00 +07:00
tctinh
457b3ac12b fix: improve rate limit handling and add prompt-too-long toast
- Add 2s deduplication window to prevent rate limit counter inflation from concurrent 429s
- Separate cooldown system from rate limits for non-429 errors (auth failures, 5xx)
- Add quota_fallback config option for automatic quota switching on rate limit
- Add toast notification for 400 'Prompt is too long' errors guiding users to /compact
- Add 5 new cooldown unit tests
- Enhance regression test suite with concurrent test infrastructure
- Add comprehensive rate limit analysis documentation
2025-12-28 16:09:52 +07:00
tctinh
16f4bb07a1 feat: add E2E testing scripts and simplify Gemini Flash model config
- Add test-models.ts for validating all supported model endpoints
- Add test-regression.ts for multi-turn regression testing (Issue #50)
- Consolidate Gemini 3 Flash variants (low/medium/high) into single model
- Fix schema structure by flattening nested signature_cache properties
- Extract streaming transformer utilities to dedicated module
2025-12-28 00:04:57 +07:00
tctinh
14f9067089 feat: add Claude tool hardening and improve context error recovery
- Add tool hardening for Claude models with parameter signature injection
  and system instruction prepending (configurable via claude_tool_hardening)
- Add context error detection (prompt_too_long, tool_pairing) with toast
  notifications to guide users on recovery actions
- Improve session recovery to handle cases where messageID isn't provided
  by fetching and finding the latest assistant message
- Change empty schema placeholder from reason (string) to _placeholder
  (boolean) to reduce token usage
- Add duplicate injection prevention for parameter signatures and tool
  hardening instructions
- Fix cache key to strip tier suffix from model name (e.g., -high, -low)
  preventing cache misses on tier change
- Add thoughtsTokenCount to usage metadata extraction
- Extract and export applyToolPairingFixes helper for centralized tool
  pairing logic
- Add comprehensive tests for recovery error detection and request helpers
2025-12-27 16:43:07 +07:00
tctinh
7c43511f7a feat: add prefix-based quota routing for Gemini CLI vs Antigravity
- Add 'antigravity-' prefix to route to Antigravity quota
- Models without prefix route to Gemini CLI quota
- Claude/GPT models auto-route to Antigravity (only available there)
- Gemini 3 tiers: Pro supports low/high, Flash supports minimal/low/medium/high
- Add thinkingLevel param for Gemini CLI, keep tier in model name for Antigravity

Resolves #51
2025-12-26 17:36:20 +07:00
tctinh
ab86d6d8d1 fix: prevent tool_use without tool_result errors
Add defense-in-depth protection against Claude API tool pairing errors:

- findOrphanedToolUseIds(): Detect orphaned tool_use blocks
- fixClaudeToolPairing(): Inject placeholder tool_result responses
- validateAndFixClaudeToolPairing(): Nuclear fallback removes broken blocks
- 12 new unit tests covering all edge cases

Fixes permanent session corruption when ESC pressed or context compacted.
2025-12-26 11:44:06 +07:00