Commit Graph

42 Commits

Author SHA1 Message Date
tctinh
3445cdaa88 feat: progressive rate limit retry with switch_on_first_rate_limit config
- Replace SHORT_RETRY_THRESHOLD_MS with progressive retry logic
- First 429: wait 1s, then switch account (if switch_on_first_rate_limit=true)
- Second 429: wait 5s, then switch (if switch_on_first_rate_limit=false)
- Single account: exponential backoff (1s, 2s, 4s... max 60s)
- Add switch_on_first_rate_limit config option (default: true)
- Update README with new config documentation

Fixes NoeFabris/opencode-antigravity-auth#147
2026-01-08 22:36:13 +07:00
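
The retry schedule described in 3445cdaa88 can be sketched roughly as below. This is an illustration under stated assumptions, not the plugin's code: the names `singleAccountBackoffMs`, `multiAccountPlan` and `RetryPlan` are hypothetical, and the behaviour for a second 429 when `switch_on_first_rate_limit` is true is a guess, since the commit message does not spell it out. Only the delays (1s, 5s, 1s/2s/4s up to 60s) and the option name come from the commit.

```typescript
// Hypothetical sketch of the schedule in the commit message; not the plugin's actual code.

/** Single account: exponential backoff of 1s, 2s, 4s, ... capped at 60s. */
function singleAccountBackoffMs(consecutive429s: number): number {
  const attempt = Math.max(1, consecutive429s);
  return Math.min(1000 * 2 ** (attempt - 1), 60_000);
}

interface RetryPlan {
  waitMs: number;
  switchAccount: boolean;
}

/**
 * Multiple accounts: the first 429 waits 1s and switches only when
 * switchOnFirstRateLimit is true; the second 429 waits 5s and switches.
 * Assumption: any later 429 is treated like the second one.
 */
function multiAccountPlan(
  consecutive429s: number,
  switchOnFirstRateLimit: boolean,
): RetryPlan {
  if (consecutive429s <= 1) {
    return { waitMs: 1000, switchAccount: switchOnFirstRateLimit };
  }
  return { waitMs: 5000, switchAccount: true };
}
```

With the default (`switch_on_first_rate_limit: true`), `multiAccountPlan(1, true)` asks for a 1s wait followed by an account switch; with the option off, the same call keeps the current account and the switch happens on the second 429.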
tctinh
815068ff05 feat: add pid_offset_enabled config option (disabled by default)
- Add pid_offset_enabled config option (default: false)
- Add env var override OPENCODE_ANTIGRAVITY_PID_OFFSET_ENABLED
- Make PID-based account offset opt-in instead of always-on
- Update README with documentation

Fixes #122
2026-01-06 09:45:46 +07:00
tctinh
eedafe5947 feat: smart account routing with error differentiation
- Add account_selection_strategy config: sticky, round-robin, hybrid
- Fix capacity vs quota error handling (Issue #111)
- Capacity errors: escalating backoff (5s→60s), retry same account
- Quota errors: switch to next account immediately
- Add touchedForQuota state tracking for hybrid strategy
- Add comprehensive tests for all strategies
2026-01-05 22:53:22 +07:00
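
The capacity/quota split in eedafe5947 might look roughly like the sketch below. The type and function names (`LimitKind`, `LimitAction`, `decideOnLimit`) are hypothetical, and the doubling schedule between 5s and 60s is an assumption: the commit message only states the endpoints of the escalating backoff.

```typescript
// Hypothetical sketch; names and the doubling schedule are assumptions, not the plugin's code.
type LimitKind = "capacity" | "quota";

type LimitAction =
  | { kind: "retry-same-account"; waitMs: number }
  | { kind: "switch-account" };

const CAPACITY_BASE_MS = 5_000;
const CAPACITY_MAX_MS = 60_000;

function decideOnLimit(error: LimitKind, consecutiveCapacityErrors: number): LimitAction {
  if (error === "quota") {
    // Quota exhausted: waiting on this account will not help, so move on immediately.
    return { kind: "switch-account" };
  }
  // Capacity error: the backend is busy, so stay on the account and back off (5s, 10s, 20s, 40s, 60s).
  const step = Math.max(0, consecutiveCapacityErrors - 1);
  const waitMs = Math.min(CAPACITY_BASE_MS * 2 ** step, CAPACITY_MAX_MS);
  return { kind: "retry-same-account", waitMs };
}
```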
Muhammad Septian
36f2f549bd fix: improve account management by ensuring current auth is added to stored accounts 2026-01-02 02:37:01 +00:00
tctinh
902b5bf91a fix: correct output redirection in E2E test scripts and update model testing options 2026-01-01 09:55:28 +07:00
Cris R.
9ae70372e2 feat: implement model-specific gemini quota with prioritized antigravity pool fallback 2025-12-30 19:21:02 +01:00
tctinh
5025723186 fix: resolve OAuth callback hanging in WSL/SSH/remote environments
- Fix IPv4/IPv6 mismatch by binding server to all interfaces
- Add WSL/SSH/remote environment detection to skip unreachable local server
- Add 30s timeout fallback with manual URL input prompt
- Add --no-browser flag support for headless environments
- Add fetch timeout (10s) to fetchProjectID() to prevent indefinite hangs
- Improve openBrowser() with WSL wslview support
2025-12-29 23:20:38 +07:00
tctinh
802dfe1fdb Refactor rate limit handling and improve error responses
- Deleted the RATE_LIMIT_ROUTING_ANALYSIS.md document as it is no longer needed.
- Enhanced the regression test to provide detailed failure information, including the first failure's stderr output.
- Updated the plugin to handle 400 errors ("Prompt too long") with a synthetic response instead of returning a session-locking error.
- Introduced createSyntheticErrorResponse function to generate a synthetic SSE response for error messages, allowing continued session usage.
- Added tests for createSyntheticErrorResponse to ensure correct behavior and structure of the synthetic SSE events.
2025-12-28 18:28:56 +07:00
tctinh
457b3ac12b fix: improve rate limit handling and add prompt-too-long toast
- Add 2s deduplication window to prevent rate limit counter inflation from concurrent 429s
- Separate cooldown system from rate limits for non-429 errors (auth failures, 5xx)
- Add quota_fallback config option for automatic quota switching on rate limit
- Add toast notification for 400 'Prompt is too long' errors guiding users to /compact
- Add 5 new cooldown unit tests
- Enhance regression test suite with concurrent test infrastructure
- Add comprehensive rate limit analysis documentation
2025-12-28 16:09:52 +07:00
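
One way to picture the 2s deduplication window from 457b3ac12b is the sketch below. The class name `RateLimitCounterSketch` and its `record` method are hypothetical, and measuring the window from the last counted 429 (rather than the last seen one) is an assumption; only the 2s figure and the goal of not inflating the counter on concurrent 429s come from the commit.

```typescript
// Hypothetical sketch: count at most one 429 per account within a 2s window.
const DEDUP_WINDOW_MS = 2000;

class RateLimitCounterSketch {
  private lastCountedAt = new Map<string, number>();
  private counts = new Map<string, number>();

  /** Records a 429 for the account and returns its current counter value. */
  record(accountId: string, nowMs: number = Date.now()): number {
    const last = this.lastCountedAt.get(accountId);
    const current = this.counts.get(accountId) ?? 0;
    if (last !== undefined && nowMs - last < DEDUP_WINDOW_MS) {
      // A concurrent 429 inside the window: leave the counter unchanged.
      return current;
    }
    this.lastCountedAt.set(accountId, nowMs);
    this.counts.set(accountId, current + 1);
    return current + 1;
  }
}
```

Several requests in flight on the same account that all fail within the window then advance any counter-driven backoff by one step instead of one step per request.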
tctinh
7f069c8a96 Merge remote-tracking branch 'origin/dev' into feature/gemini-cli-routing 2025-12-28 00:36:30 +07:00
Tinh To
a658937bc0 Merge branch 'dev' into fix/account-duplication 2025-12-28 00:09:22 +07:00
tctinh
14f9067089 feat: add Claude tool hardening and improve context error recovery
- Add tool hardening for Claude models with parameter signature injection
  and system instruction prepending (configurable via claude_tool_hardening)
- Add context error detection (prompt_too_long, tool_pairing) with toast
  notifications to guide users on recovery actions
- Improve session recovery to handle cases where messageID isn't provided
  by fetching and finding the latest assistant message
- Change empty schema placeholder from reason (string) to _placeholder
  (boolean) to reduce token usage
- Add duplicate injection prevention for parameter signatures and tool
  hardening instructions
- Fix cache key to strip tier suffix from model name (e.g., -high, -low)
  preventing cache misses on tier change
- Add thoughtsTokenCount to usage metadata extraction
- Extract and export applyToolPairingFixes helper for centralized tool
  pairing logic
- Add comprehensive tests for recovery error detection and request helpers
2025-12-27 16:43:07 +07:00
tctinh
7bef0105ef Merge remote-tracking branch 'origin/dev' into feature/gemini-cli-routing 2025-12-27 11:21:11 +07:00
CasualDeveloper
e879ad9594 fix: fail fast when rate-limit wait exceeds configurable threshold (#59)
- Add max_rate_limit_wait_seconds config option (default: 300s / 5 min)
- Set to 0 to disable fail-fast and wait indefinitely
- Shows clear error message with quota reset time
- Suggests adding more accounts or waiting
2025-12-26 23:26:29 +08:00
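
The fail-fast rule in e879ad9594 reduces to a small check, sketched here. The function name `shouldFailFast` is hypothetical; the option name `max_rate_limit_wait_seconds`, its 300s default and the "0 disables" rule come from the commit message. Whether a wait exactly equal to the threshold fails is not stated, so the strict comparison is an assumption.

```typescript
// Hypothetical sketch; only the option semantics come from the commit message.
const DEFAULT_MAX_RATE_LIMIT_WAIT_SECONDS = 300;

function shouldFailFast(
  waitMs: number,
  maxRateLimitWaitSeconds: number = DEFAULT_MAX_RATE_LIMIT_WAIT_SECONDS,
): boolean {
  if (maxRateLimitWaitSeconds === 0) {
    return false; // 0 disables fail-fast: wait as long as the quota reset requires
  }
  return waitMs > maxRateLimitWaitSeconds * 1000;
}
```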
tctinh
53e21a7264 Merge remote-tracking branch 'origin/dev' into feature/gemini-cli-routing 2025-12-26 21:40:59 +07:00
CasualDeveloper
3a5a3fa50a fix: prevent infinite loop, normalize empty tool schemas, and retry thinking recovery
- Add guards when accountCount <= 1 to prevent infinite while(true) loop
- Force type: 'object' and inject placeholder property for empty Claude schemas
- Fix THINKING_RECOVERY_NEEDED to actually retry with closeToolLoopForThinking()
  instead of returning a useless 'send continue' error message

The thinking recovery now:
1. Detects API error indicating thinking_block_order issue
2. Sets forceThinkingRecovery flag and restarts endpoint loop
3. On retry, closeToolLoopForThinking() strips all thinking and injects
   synthetic messages to start a fresh turn
4. Only retries once to avoid infinite loops
2025-12-26 18:48:17 +08:00
tctinh
7c43511f7a feat: add prefix-based quota routing for Gemini CLI vs Antigravity
- Add 'antigravity-' prefix to route to Antigravity quota
- Models without prefix route to Gemini CLI quota
- Claude/GPT models auto-route to Antigravity (only available there)
- Gemini 3 tiers: Pro supports low/high, Flash supports minimal/low/medium/high
- Add thinkingLevel param for Gemini CLI, keep tier in model name for Antigravity

Resolves #51
2025-12-26 17:36:20 +07:00
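
The prefix rule from 7c43511f7a can be sketched as a small routing function. Everything beyond the `antigravity-` prefix and the Claude/GPT rule is an assumption: the function name `routeModelSketch`, stripping the prefix before forwarding, matching Claude/GPT by a leading `claude`/`gpt` in the model name, and the model names used in the usage note are all hypothetical.

```typescript
// Hypothetical sketch of prefix-based quota routing; not the plugin's actual code.
type QuotaPool = "antigravity" | "gemini-cli";

interface RoutedModel {
  pool: QuotaPool;
  model: string;
}

const ANTIGRAVITY_PREFIX = "antigravity-";

function routeModelSketch(requested: string): RoutedModel {
  if (requested.startsWith(ANTIGRAVITY_PREFIX)) {
    // Explicit prefix: use the Antigravity quota (assumption: the prefix is stripped).
    return { pool: "antigravity", model: requested.slice(ANTIGRAVITY_PREFIX.length) };
  }
  if (/^(claude|gpt)/i.test(requested)) {
    // Claude/GPT models are only available on Antigravity, so they route there automatically.
    return { pool: "antigravity", model: requested };
  }
  // Everything else without the prefix goes to the Gemini CLI quota.
  return { pool: "gemini-cli", model: requested };
}
```

For example, `routeModelSketch("antigravity-some-model-high")` selects the Antigravity pool with the tier kept in the model name, while `routeModelSketch("some-gemini-model")` selects the Gemini CLI pool.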
tctinh
10fb25ad8c Merge origin/dev into feature/fix-tool-pairing
Resolves import conflicts in request.ts - keeps both transform imports
and detectErrorType import from recovery module.
2025-12-26 12:12:31 +07:00
tctinh
f145c7e5cb fix: add inline recovery for thinking_block_order errors
- Export detectErrorType() and isRecoverableError() from recovery.ts
- Import detectErrorType in request.ts
- Detect thinking_block_order errors in transformAntigravityResponse
- Throw THINKING_RECOVERY_NEEDED to trigger recovery in fetch wrapper
- Return synthetic error response with recovery instructions

The session.error hook wasn't being triggered for API 400 errors.
This fix catches recoverable errors inline and returns a user-friendly
message instructing them to send 'continue' to resume.
2025-12-26 12:07:45 +07:00
tctinh
ab86d6d8d1 fix: prevent tool_use without tool_result errors
Add defense-in-depth protection against Claude API tool pairing errors:

- findOrphanedToolUseIds(): Detect orphaned tool_use blocks
- fixClaudeToolPairing(): Inject placeholder tool_result responses
- validateAndFixClaudeToolPairing(): Nuclear fallback removes broken blocks
- 12 new unit tests covering all edge cases

Fixes permanent session corruption when ESC pressed or context compacted.
2025-12-26 11:44:06 +07:00
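
The orphan detection described in ab86d6d8d1 can be illustrated with the sketch below. It is not the plugin's `findOrphanedToolUseIds()` or `fixClaudeToolPairing()`: the function names `findOrphanedIdsSketch` and `placeholderResultsSketch`, the simplified Anthropic-style block shapes, and the placeholder text are all assumptions made for illustration.

```typescript
// Hypothetical sketch using simplified Anthropic-style content blocks.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface ChatMessage {
  role: "user" | "assistant";
  content: ContentBlock[];
}

/** Returns the ids of tool_use blocks that have no matching tool_result anywhere in the history. */
function findOrphanedIdsSketch(messages: ChatMessage[]): string[] {
  const answered = new Set<string>();
  for (const message of messages) {
    for (const block of message.content) {
      if (block.type === "tool_result") answered.add(block.tool_use_id);
    }
  }
  const orphans: string[] = [];
  for (const message of messages) {
    for (const block of message.content) {
      if (block.type === "tool_use" && !answered.has(block.id)) orphans.push(block.id);
    }
  }
  return orphans;
}

/** Builds one placeholder tool_result per orphaned id (the placeholder text is an assumption). */
function placeholderResultsSketch(orphanIds: string[]): ContentBlock[] {
  return orphanIds.map((id) => ({
    type: "tool_result" as const,
    tool_use_id: id,
    content: "[tool call was interrupted; no result available]",
  }));
}
```

Where the placeholders are spliced back into the history is left out here; the commit message only says they are injected, with removal of the broken blocks as the last-resort fallback.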
tctinh
6c1e3b7b40 feat: Implement session recovery module for handling recoverable errors
- Added recovery functionality for tool_result_missing, thinking_block_order, and thinking_disabled_violation errors.
- Introduced constants and types for session recovery.
- Created storage utilities for reading and writing session data.
- Enhanced debug logging capabilities in debug.ts.
- Refactored debug state management for better initialization and access.
2025-12-26 02:09:15 +07:00
tctinh
bd9e6ef1bc fix: add thinking recovery message for toast notifications in plugin 2025-12-26 01:35:07 +07:00
tctinh
5d52ed0fa4 Add Antigravity schema cleaner and warmup fixes
Port CleanJSONSchemaForAntigravity from CLIProxyAPI to convert
unsupported JSON Schema features into description hints and remove
unsupported keywords. Add warmup tracking improvements: retry limit,
success marking, and cleanup when evicting old entries.
2025-12-24 12:25:47 +07:00
Noe
abba87ad12 PR Fixes 2025-12-22 23:24:20 +01:00
Noe
e382fcd644 Improves Gemini quota handling with dual pools
Enhances Gemini quota management by introducing dual quota pools (Antigravity and Gemini CLI) for increased quota availability.

Gemini models now automatically fall back to the second quota pool when the first is exhausted, effectively doubling the quota per account.
2025-12-22 23:07:57 +01:00
Ariane Emory
5ab0a1b23c Merge upstream/main into fix/account-duplication
- Preserve @ariane-emory scoped package name
- Update version to 1.2.1-fix-account-duplication
- Integrate deduplication logic with upstream v1->v2 migration
- Include all upstream features: auto-update checker, enhanced logging, multi-account improvements
2025-12-21 16:48:00 -05:00
Noe
2a4909bdae PR fixes 2025-12-21 13:13:05 +01:00
Noe
a5f2ff440c feat: implement auto-update checker for Antigravity plugin with installation instructions and model updates 2025-12-21 03:09:16 +01:00
Noe
684076eaba Improves Antigravity account management and debugging
Refactors account management to support sticky sessions, per-model-family rate limits, and enhanced debugging capabilities.

- Implements sticky account selection, preserving Anthropic's prompt cache by sticking to the same account until a rate limit is encountered.
- Tracks rate limits separately for Claude and Gemini models, allowing an account to be used for one model family even if rate-limited for another.
- Introduces a smart retry threshold for short rate limits, retrying on the same account to avoid unnecessary switching.
- Adds exponential backoff for consecutive rate limits, increasing delays with each subsequent 429 error.
- Includes quota reset times in rate limit toast notifications when available from the API.
- Debounces toast notifications to prevent spam during streaming responses.
- Introduces a quiet mode to suppress account-related toast notifications via an environment variable.
- Enhances debug logging with level-based verbosity, TUI integration, and auto-stripping of injected debug blocks.
2025-12-21 02:25:44 +01:00
Ariane Emory
d63478bb57 Merge NoeFabris/main into fix/account-duplication 2025-12-18 23:54:17 -05:00
Noè
86533c104e Merge branch 'main' into claude-improvements 2025-12-18 10:37:37 +00:00
Ariane Emory
02e8a2b79e Don't reset rate limits when refreshing token
Rate limits are tied to the account/project, not the token.
Refreshing a token doesn't reset rate limit state.
2025-12-17 16:07:23 -05:00
Ariane Emory
584d9c4f1e Address CodeRabbit review feedback and merge upstream/main
- Fix integer overflow risk in score calculation (compare fields separately)
- Condense verbose test comments
- Add docstrings to storage.ts for improved documentation coverage
- Merge upstream/main to get toast fix for single-account users
- Update version to 1.1.7-fix-account-duplication
2025-12-17 15:35:37 -05:00
Ariane Emory
42c81b064d Fix account count messages to show actual deduplicated count
Update CLI prompts and completion message to read the actual account
count from storage after deduplication, instead of showing how many
OAuth exchanges were completed (which could include duplicates).
2025-12-17 14:47:18 -05:00
Ariane Emory
23cf7a7af3 Fix account duplication issue #24 - deduplicate by email
- Add email-based deduplication in persistAccountPool to prevent duplicates when refresh token changes
- Add deduplicateAccountsByEmail function to clean up existing duplicate accounts on load
- Add comprehensive unit tests for deduplication logic, including the exact scenario from issue #24
- When same email re-authenticates, replace existing entry instead of creating duplicate
- Preserve newest account (by lastUsed, then addedAt) for each email address
2025-12-17 14:22:55 -05:00
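
The "keep the newest account per email" rule from 23cf7a7af3 can be sketched as follows. This is not the plugin's `deduplicateAccountsByEmail`: the name `dedupeByEmailSketch`, the `StoredAccountSketch` shape, exact-match comparison of emails, and keeping accounts without an email untouched are assumptions. The ordering (by `lastUsed`, then `addedAt`) comes from the commit message.

```typescript
// Hypothetical sketch of email-based deduplication; field names are assumptions.
interface StoredAccountSketch {
  email?: string;
  refreshToken: string;
  lastUsed: number; // epoch milliseconds
  addedAt: number; // epoch milliseconds
}

/** Compares the fields one at a time instead of folding them into a single score. */
function isNewerSketch(a: StoredAccountSketch, b: StoredAccountSketch): boolean {
  if (a.lastUsed !== b.lastUsed) return a.lastUsed > b.lastUsed;
  return a.addedAt > b.addedAt;
}

function dedupeByEmailSketch(accounts: StoredAccountSketch[]): StoredAccountSketch[] {
  const byEmail = new Map<string, StoredAccountSketch>();
  const withoutEmail: StoredAccountSketch[] = [];
  for (const account of accounts) {
    if (!account.email) {
      withoutEmail.push(account); // nothing to deduplicate on
      continue;
    }
    const existing = byEmail.get(account.email);
    if (existing === undefined || isNewerSketch(account, existing)) {
      byEmail.set(account.email, account);
    }
  }
  return Array.from(byEmail.values()).concat(withoutEmail);
}
```

When the same email re-authenticates and receives a new refresh token, the entry with the later `lastUsed` replaces the older one instead of being stored alongside it.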
fgonzalezurriola
6d78ea5d91 refactor: the "Using {@email}" toast doesn't show for users with fewer than 2 accounts 2025-12-17 14:37:20 -03:00
Noe
314ac9d427 feat(claude): add multi-turn thinking signature caching and real-time SSE streaming
Implement comprehensive support for Claude thinking models with interleaved
thinking in multi-turn conversations:

- Add signature caching system to preserve and restore thinking block
  signatures across conversation turns, preventing "invalid signature" errors
- Enable real-time SSE streaming with immediate forwarding of thinking tokens
- Add interleaved-thinking-2025-05-14 beta header for Claude thinking models
- Implement smart system hints to encourage thinking during tool use
- Add VALIDATED mode for tool calling on Claude models
- Ensure output token limits accommodate thinking budgets
- Filter and sanitize thinking blocks, removing SDK-injected cache_control
- Add comprehensive test suites for auth, cache, and request-helpers modules
- Update build config to exclude test files from production builds
- Document streaming and thinking features in README
2025-12-17 15:52:40 +00:00
Noe
debcfb7443 fix: add modalities support for input/output in configuration and improve toast notifications in plugin 2025-12-17 00:18:21 +00:00
Noe
c42a90d645 feat(auth): add login mode selection and improve rate limit handling
- Add CLI prompt to choose between adding accounts or starting fresh
- Implement automatic retry with backoff for single-account rate limits
- Show toast notifications for account switching and rate limit status
- Clear stale account storage when OpenCode auth state changes
- Add sleep helper function with abort signal support
- Improve README with clearer step-by-step setup instructions

TUI flow now adds accounts non-destructively; CLI flow offers choice.
2025-12-17 00:09:17 +00:00
Noe
2052e4d580 Add multi-account load balancing and improved OAuth UX
Adds multi-account support and round-robin load balancing for Google Antigravity OAuth to increase request throughput and resilience. Introduces an on-disk account pool with cooldowns for rate-limited accounts, automatic removal of revoked refresh tokens, and persistence of rotation state.

Improves OAuth flows and UX: CLI flow can add multiple accounts with per-account project IDs, TUI flow remains single-account, improved browser opening/fallback copy-paste handling, and clearer prompts for pasting redirect URLs or codes. Adds robust parsing of callback input and better headless handling.

Makes token refresh handling explicit and typed (throws a specific error on invalid_grant) and centralizes account management logic into an in-memory manager with persistence utilities. Adds tests for account rotation and rate-limit behavior and bumps package version.

Overall, this increases reliability under rate limits, makes multi-account configuration straightforward, and improves error handling and developer/user experience.
2025-12-16 01:35:02 +00:00
Noe
d946805235 Updates to make it work with workspace accounts 2025-12-10 11:33:03 +00:00
Noe
5d229bf44e First commit - auth and models working 2025-12-09 23:59:18 +00:00