- Move rateLimitToastCooldowns to module level (persists across requests)
- Add allAccountsRateLimitedToastShown flag to show toast only once per event
- Add cleanup for toast cooldown map to prevent memory growth
Fixes #263
- Add enabled field to AccountMetadataV3 for account enable/disable
- Add getEnabledAccounts/getTotalAccountCount methods to AccountManager
- Add quota.ts with quota checking logic via fetchAvailableModels API
- Integrate 'Check quotas' and 'Manage accounts' options into auth menu
- Display quota status for Claude, Gemini 3 Pro, and Gemini 3 Flash
- Add standalone check-quota.mjs script for manual verification
- Update tsconfig to exclude temp_research folder
Prevents toast spam when multiple 429s occur in quick succession.
Same message type won't show again within 5 seconds.
Fixes issue #263 - 'Rate limited' toast appearing too frequently
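The cooldown described above can be sketched roughly as follows. Only the 5-second window and the idea of a module-level map with cleanup come from these notes; the names `toastCooldowns` and `shouldShowToast` are hypothetical and the plugin's actual code may differ.

```typescript
// Cooldown window taken from the commit text; all names here are hypothetical.
const TOAST_COOLDOWN_MS = 5000;

// Module-level map, so entries persist across requests.
// Key: message type, value: timestamp (ms) when the toast was last shown.
const toastCooldowns = new Map<string, number>();

function shouldShowToast(messageType: string, now: number = Date.now()): boolean {
  // Cleanup: drop expired entries so the map cannot grow without bound.
  toastCooldowns.forEach((shownAt, key) => {
    if (now - shownAt >= TOAST_COOLDOWN_MS) {
      toastCooldowns.delete(key);
    }
  });

  // Still inside the cooldown window for this message type: suppress.
  if (toastCooldowns.has(messageType)) {
    return false;
  }

  toastCooldowns.set(messageType, now);
  return true;
}
```

A caller would guard each toast with `if (shouldShowToast("rate-limited")) { ... }`, so a burst of 429s produces one notification per window.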
Proposal 2: When max capacity retries (3) are exhausted, regenerate
the account fingerprint to get a fresh device identity before trying
the next endpoint. This may help bypass per-device throttling.
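A minimal sketch of the retry-then-regenerate flow, assuming a per-account retry counter. The limit of 3 comes from the text above; `AccountState`, `newFingerprint`, and `onCapacityFailure` are hypothetical, and the real fingerprint format is not described here.

```typescript
const MAX_CAPACITY_RETRIES = 3; // from the commit text

// Hypothetical account shape for illustration only.
interface AccountState {
  fingerprint: string;
  capacityRetries: number;
}

// Placeholder generator; the plugin's actual fingerprint format is unknown.
function newFingerprint(): string {
  return Math.random().toString(16).slice(2, 10) + Date.now().toString(16);
}

// Call after each capacity failure. Returns true when retries are exhausted,
// the fingerprint has been regenerated, and the caller should try the next endpoint.
function onCapacityFailure(
  account: AccountState,
  regenerate: () => string = newFingerprint,
): boolean {
  account.capacityRetries += 1;
  if (account.capacityRetries < MAX_CAPACITY_RETRIES) {
    return false; // keep retrying the current endpoint
  }
  account.fingerprint = regenerate();
  account.capacityRetries = 0;
  return true;
}
```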
Changes:
- Upgraded to Zod v4 and adjusted schema generation for compatibility
- Fixed keep_thinking=true failing without debug mode (signature validation)
- Fixed tool calls failing for tools with no parameters (default args to {})
- Aligned auth headers with official Gemini CLI to reduce account issues
- Fixed quiet_mode not suppressing all toast notifications
Refactors how Claude thinking blocks are handled to improve reliability and address signature validation issues.
- Removes the deprecated `KEEP_THINKING_BLOCKS` constant and associated environment variable.
- Introduces runtime configuration for `keep_thinking` to control whether thinking blocks are preserved.
- Implements logic to inject a sentinel value (`SKIP_THOUGHT_SIGNATURE`) when a thinking block's signature is invalid or missing, allowing the API to bypass validation. This addresses scenarios like cache misses, session mismatches, and plugin restarts.
This change enhances the robustness of Claude integrations by providing a mechanism to handle signature issues gracefully.
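The sentinel injection might look roughly like the sketch below. The constant name `SKIP_THOUGHT_SIGNATURE` and the `keep_thinking` option come from the text above; the sentinel's string value, the `Block` shape, the `isValidSignature` callback, and dropping thinking blocks when `keep_thinking` is off are assumptions made for illustration.

```typescript
// Placeholder value; the real sentinel string is not given in these notes.
const SKIP_THOUGHT_SIGNATURE = "skip_thought_signature";

// Simplified content-block shape, assumed for this sketch.
type Block =
  | { type: "thinking"; thinking: string; signature?: string }
  | { type: "text"; text: string };

function prepareBlocks(
  blocks: Block[],
  keepThinking: boolean,
  isValidSignature: (signature: string) => boolean,
): Block[] {
  // keep_thinking disabled: thinking blocks are not preserved (assumed behavior).
  if (!keepThinking) {
    return blocks.filter((b) => b.type !== "thinking");
  }

  return blocks.map((b) => {
    if (b.type !== "thinking") return b;
    const ok = typeof b.signature === "string" && isValidSignature(b.signature);
    // Missing or invalid signature: inject the sentinel instead of failing validation.
    return ok ? b : { ...b, signature: SKIP_THOUGHT_SIGNATURE };
  });
}
```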
Add two new configuration options to allow users to tune rate limit behavior:
- default_retry_after_seconds: Default retry delay when the API doesn't return
  a retry-after header (default: 60s, configurable 1-300s)
- max_backoff_seconds: Maximum cap for the exponential backoff delay
  (default: 60s, configurable 5-300s)
These options help users who want faster retries at the cost of potentially
hitting more 429 errors, or users who prefer longer delays to preserve
prompt cache by staying on the same account.
The hardcoded 60-second values are now configurable while maintaining
backward compatibility with existing configurations.
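To illustrate how the two options could interact, here is a sketch under stated assumptions. The option names, defaults, and ranges come from the text above. Clamping out-of-range values (the real schema may reject them instead), honoring a server-provided retry-after as-is, and doubling the default delay per attempt are assumptions, and the function names are hypothetical.

```typescript
interface RateLimitConfig {
  default_retry_after_seconds: number; // 1-300, default 60
  max_backoff_seconds: number; // 5-300, default 60
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Fill in defaults and keep values inside the documented ranges (assumed clamping).
function resolveRateLimitConfig(user: Partial<RateLimitConfig> = {}): RateLimitConfig {
  return {
    default_retry_after_seconds: clamp(user.default_retry_after_seconds ?? 60, 1, 300),
    max_backoff_seconds: clamp(user.max_backoff_seconds ?? 60, 5, 300),
  };
}

// attempt is 0-based; retryAfterSeconds is the server's hint, when present.
function retryDelaySeconds(
  config: RateLimitConfig,
  attempt: number,
  retryAfterSeconds?: number,
): number {
  // Assumption: a server-provided value is used unchanged.
  if (retryAfterSeconds !== undefined) {
    return retryAfterSeconds;
  }
  // Otherwise back off exponentially from the default, capped at the maximum.
  const delay = config.default_retry_after_seconds * Math.pow(2, attempt);
  return Math.min(delay, config.max_backoff_seconds);
}
```

With the defaults (60 and 60) every delay is 60 seconds, which matches the previously hardcoded behavior. Lowering `default_retry_after_seconds` to 5 gives 5, 10, 20, 40, then 60 seconds.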
- Added SERVICE_CAPACITY_EXHAUSTED rate limit reason type
- Updated parseRateLimitReason to detect "resource has been exhausted" without quotaResetTime as SERVICE_CAPACITY_EXHAUSTED
- Added progressive backoff tiers for service capacity exhaustion (5s, 10s, 20s, 30s, 60s)
- Integrated global capacity tracking with upstream/dev's rate limit system
- For SERVICE_CAPACITY_EXHAUSTED: uses global in-memory cooldown (not persisted)
- For other rate limits: uses upstream/dev's account-level tracking
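The detection rule and the backoff tiers listed above could be sketched as follows. The tier values and the "resource has been exhausted" match come from these notes; the function names and signatures are hypothetical and do not reproduce the actual `parseRateLimitReason` implementation.

```typescript
// Progressive tiers from the commit text: 5s, 10s, 20s, 30s, 60s.
const CAPACITY_BACKOFF_TIERS_MS = [5000, 10000, 20000, 30000, 60000];

// Hypothetical helper: true when a 429 looks like temporary service capacity
// exhaustion, i.e. the message matches and no quotaResetTime was returned.
function isServiceCapacityExhausted(message: string, quotaResetTime?: string): boolean {
  const matches = message.toLowerCase().indexOf("resource has been exhausted") !== -1;
  return matches && quotaResetTime === undefined;
}

// consecutiveFailures is 1-based; failures past the last tier stay at 60s.
function capacityBackoffMs(consecutiveFailures: number): number {
  const index = Math.min(
    Math.max(consecutiveFailures, 1) - 1,
    CAPACITY_BACKOFF_TIERS_MS.length - 1,
  );
  return CAPACITY_BACKOFF_TIERS_MS[index];
}
```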
- Logic: capacity exhausted now forces a new account selection (break instead of continue)
- Style: separate the read function from the mutating one (recordAndGetCapacityBackoff + calculateCapacityBackoffDelay)
- Docs: add JSDoc for the capacity functions (5 functions)
Problem: When the server returns 429 with the message "Resource has been
exhausted" and no quotaResetTime, the plugin marks all accounts as
permanently rate-limited in antigravity-accounts.json, even though this is
a temporary backend capacity problem.
Solution:
- Distinguish between exhausted quota (persisted) and capacity exhausted (global)
- Capacity exhausted uses an in-memory cooldown per family/model (not persisted)
- The real quota flow is unchanged (429 with quotaResetTime works as before)
Files:
- src/plugin.ts: add global capacity tracking and helpers
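A minimal sketch of an in-memory cooldown keyed by family and model, assuming a plain `Map` that is never written to disk. All names below are hypothetical, and the actual helpers in src/plugin.ts may be structured differently.

```typescript
// In-memory only: nothing here is written to antigravity-accounts.json.
// Key: "family:model", value: timestamp (ms) when the cooldown ends.
const capacityCooldownUntil = new Map<string, number>();

function cooldownKey(family: string, model: string): string {
  return family + ":" + model;
}

// Mutation: start (or extend) the global cooldown for a family/model pair.
function startCapacityCooldown(
  family: string,
  model: string,
  durationMs: number,
  now: number = Date.now(),
): void {
  capacityCooldownUntil.set(cooldownKey(family, model), now + durationMs);
}

// Pure read: remaining cooldown in ms, or 0 when none is active.
function capacityCooldownRemainingMs(
  family: string,
  model: string,
  now: number = Date.now(),
): number {
  const until = capacityCooldownUntil.get(cooldownKey(family, model));
  if (until === undefined) return 0;
  return Math.max(0, until - now);
}
```

Keeping the read separate from the mutation means callers can check the cooldown without side effects, and a process restart clears the state, so a temporary backend issue never leaves accounts marked as rate-limited.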
- Fix double-prefix bug in resolveModelForHeaderStyle() where 'antigravity-gemini-3-flash' was being sent to API instead of 'gemini-3-flash'
- Add parseRateLimitReason() to classify rate limit errors (QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR)
- Add calculateBackoffMs() with different backoff strategies per reason type
- Add markRateLimitedWithReason() to track consecutive failures per account
- Add markRequestSuccess() to reset failure counter on success
- Add shouldTryOptimisticReset() and clearAllRateLimitsForFamily() for smarter rate limit recovery
- Integrate markRateLimitedWithReason in plugin.ts rate limit handling
- Add comprehensive tests for all new functions
Fixes #103
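The consecutive-failure tracking listed above can be sketched as a small per-account counter. The reason names come from the list above; `recordRateLimit` and `recordSuccess` are hypothetical stand-ins for the roles of markRateLimitedWithReason() and markRequestSuccess(), whose real signatures are not shown here.

```typescript
// Reason types as listed in the commit text.
type RateLimitReason =
  | "QUOTA_EXHAUSTED"
  | "RATE_LIMIT_EXCEEDED"
  | "MODEL_CAPACITY_EXHAUSTED"
  | "SERVER_ERROR";

interface FailureRecord {
  consecutiveFailures: number;
  lastReason: RateLimitReason;
}

// Hypothetical in-memory store keyed by account id.
const failureRecords = new Map<string, FailureRecord>();

// Record a rate-limit hit and return the updated consecutive failure count,
// which a caller could feed into a per-reason backoff calculation.
function recordRateLimit(accountId: string, reason: RateLimitReason): number {
  const previous = failureRecords.get(accountId);
  const consecutiveFailures = (previous ? previous.consecutiveFailures : 0) + 1;
  failureRecords.set(accountId, { consecutiveFailures, lastReason: reason });
  return consecutiveFailures;
}

// A successful request resets the counter for that account.
function recordSuccess(accountId: string): void {
  failureRecords.delete(accountId);
}
```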