- Add safety check in stripAllThinkingBlocks() to inject placeholder when all content stripped
- Enhance debug logging: filter tool summary to only show problematic tools (hasSchema=n)
- Add content stats tracking (messages, text, tool_use, thinking, empty) to error debug info
- Add regression tests for the empty content and debug formatting scenarios
Fixes 400 'Request contains an invalid argument' errors that occurred when thinking
block stripping removed all content from a message.
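The safety check can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Part` and `Message` shapes and the placeholder string are assumptions made for the example; only the function name `stripAllThinkingBlocks` comes from the commit.

```typescript
// Hypothetical message shapes; the real types in the codebase may differ.
type Part =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "tool_use"; name: string };

interface Message {
  role: "user" | "assistant";
  content: Part[];
}

// Assumed placeholder text; the string used by the real fix is not stated.
const PLACEHOLDER_TEXT = "[thinking removed]";

function stripAllThinkingBlocks(messages: Message[]): Message[] {
  return messages.map((message): Message => {
    const kept = message.content.filter((part) => part.type !== "thinking");
    // Safety check: never emit a message whose content array is empty,
    // since an empty message can be rejected as an invalid argument.
    const content: Part[] =
      kept.length > 0 ? kept : [{ type: "text", text: PLACEHOLDER_TEXT }];
    return { role: message.role, content };
  });
}
```

A message that contained only thinking blocks comes back with a single text placeholder instead of an empty array; messages with other content are returned without the thinking parts.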
Prevent Google from penalizing accounts by stopping usage before quota
is fully exhausted. Uses same wait/retry logic as rate limits.
Features:
- soft_quota_threshold_percent (default 90%): Skip account when usage
  exceeds this percentage
- quota_refresh_interval_minutes (default 15): Background refresh after
  successful API requests
- soft_quota_cache_ttl_minutes (default 'auto'): Cache freshness TTL,
  auto = max(2 * refresh_interval, 10)
Behavior:
- Accounts over threshold are skipped during selection
- When all accounts are over threshold: wait for the earliest reset time
- If wait exceeds max_rate_limit_wait_seconds: error immediately
- Stale/missing cache: fail-open (allow account)
- threshold=100 disables protection
Matches rate limit behavior for consistency.
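The selection rules above could look something like the sketch below. All type and function names (`QuotaSnapshot`, `Account`, `selectAccount`, and so on) are hypothetical, and the strict `>` comparison is an assumption based on the word "exceeds"; only the three config keys and `max_rate_limit_wait_seconds` come from the commit.

```typescript
// Hypothetical shapes for illustration only.
interface QuotaSnapshot {
  usedPercent: number; // 0..100
  fetchedAtMs: number; // when the cached value was fetched
  resetAtMs: number;   // when the quota resets
}

interface Account {
  id: string;
  quota?: QuotaSnapshot; // undefined = nothing cached yet
}

interface SoftQuotaConfig {
  thresholdPercent: number;         // soft_quota_threshold_percent
  refreshIntervalMinutes: number;   // quota_refresh_interval_minutes
  cacheTtlMinutes: number | "auto"; // soft_quota_cache_ttl_minutes
  maxRateLimitWaitSeconds: number;  // max_rate_limit_wait_seconds
}

type Decision =
  | { kind: "use"; accountId: string }
  | { kind: "wait"; waitMs: number }
  | { kind: "error"; reason: string };

// auto = max(2 * refresh_interval, 10)
function resolveCacheTtlMinutes(cfg: SoftQuotaConfig): number {
  return cfg.cacheTtlMinutes === "auto"
    ? Math.max(2 * cfg.refreshIntervalMinutes, 10)
    : cfg.cacheTtlMinutes;
}

function isOverSoftQuota(quota: QuotaSnapshot, cfg: SoftQuotaConfig, nowMs: number): boolean {
  if (cfg.thresholdPercent >= 100) return false; // threshold=100 disables protection
  const ttlMs = resolveCacheTtlMinutes(cfg) * 60 * 1000;
  if (nowMs - quota.fetchedAtMs > ttlMs) return false; // stale cache: fail open
  return quota.usedPercent > cfg.thresholdPercent; // strictness is an assumption
}

function selectAccount(accounts: Account[], cfg: SoftQuotaConfig, nowMs: number): Decision {
  let earliestResetMs = Infinity;
  for (const account of accounts) {
    const quota = account.quota;
    // Missing cache: fail open and allow the account.
    if (quota === undefined || !isOverSoftQuota(quota, cfg, nowMs)) {
      return { kind: "use", accountId: account.id };
    }
    earliestResetMs = Math.min(earliestResetMs, quota.resetAtMs);
  }
  // Every account is over threshold (an empty list also ends up here).
  const waitMs = Math.max(0, earliestResetMs - nowMs);
  if (waitMs > cfg.maxRateLimitWaitSeconds * 1000) {
    return { kind: "error", reason: "all accounts over soft quota; wait too long" };
  }
  return { kind: "wait", waitMs };
}
```

Returning a tagged `Decision` keeps the caller's wait/retry loop identical to the one used for rate limits, which is the consistency the commit aims for.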
- Enhanced issue template configuration with troubleshooting and existing issues links.
- Removed old feature request markdown and replaced it with a new YAML template for better structure.
- Updated README to include a comprehensive troubleshooting section and clarified configuration paths.
- Added detailed troubleshooting steps for common issues, including rate limits and OAuth callback problems.
- Incremented version to 1.2.9-beta.10 in package.json.
- Created CONFIGURATION.md for detailed configuration options and examples.
- Added MODEL-VARIANTS.md to explain the variant system and its usage.
- Introduced MULTI-ACCOUNT.md to guide users on setting up and managing multiple accounts.
- Developed TROUBLESHOOTING.md to address common issues and provide solutions.
- Add Windows PowerShell/Command Prompt equivalents for port discovery
- Add hint about trying common ports (8080, 3000, 5000)
- Add alternative lsof command to list all listening processes
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- Wrap tools in functionDeclarations format for Gemini 3 API
- Flatten incoming functionDeclarations and convert parameters to proper schema
- Re-implement session-level thinking deduplication after partial revert
- Update README model names for clarity (Antigravity vs Gemini CLI)
- Fix model resolver to append default tier for Antigravity Gemini 3 models
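As an illustration of the tool wrapping and flattening described above, here is a minimal sketch. The `tools: [{ functionDeclarations: [...] }]` envelope is the shape the Gemini API expects; the `IncomingTool` type and the `toGeminiTools` helper are hypothetical names, and the parameter schema conversion mentioned in the commit is deliberately left out.

```typescript
interface FunctionDeclaration {
  name: string;
  description?: string;
  parameters?: Record<string, unknown>;
}

// Assumption: incoming tools are either bare declarations or already wrapped.
type IncomingTool =
  | FunctionDeclaration
  | { functionDeclarations: FunctionDeclaration[] };

function toGeminiTools(
  tools: IncomingTool[]
): Array<{ functionDeclarations: FunctionDeclaration[] }> {
  const flat: FunctionDeclaration[] = [];
  for (const tool of tools) {
    if ("functionDeclarations" in tool) {
      // Already wrapped: flatten its declarations into the single list.
      for (const declaration of tool.functionDeclarations) {
        flat.push(declaration);
      }
    } else {
      flat.push(tool);
    }
  }
  // Emit one wrapper holding every declaration, or nothing if there are no tools.
  return flat.length > 0 ? [{ functionDeclarations: flat }] : [];
}
```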
Merge conflict resolution and defaults per official API docs.
**Gemini 3** (https://cloud.google.com/vertex-ai/generative-ai/docs/thinking):
> HIGH: Allows the model to use more tokens for thinking... This is the
> default level for Gemini 3 Pro and Gemini 3 Flash.
- Default thinkingLevel: 'high' for both Pro and Flash
- Model-specific levels: Flash (minimal/low/medium/high), Pro (low/high only)
**Claude** (https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking):
> To turn on extended thinking, add a thinking object... and the
> budget_tokens to a specified token budget.
- budget_tokens is required when enabling extended thinking (no default)
- Default thinkingBudget: 32768 (max) for Claude thinking models without variant
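The two defaults above could be resolved along these lines. This is a sketch under assumptions: the substring matching on model IDs and the `defaultThinkingFor` helper are invented for the example, while the values ('high' and 32768) come from the commit text.

```typescript
type ThinkingDefaults =
  | { family: "gemini-3"; thinkingLevel: "high" }
  | { family: "claude"; thinkingBudget: number };

// Default (max) budget for Claude thinking models without a variant.
const CLAUDE_DEFAULT_THINKING_BUDGET = 32768;

// Assumption: model families can be recognised by substrings of the model ID.
function defaultThinkingFor(modelId: string): ThinkingDefaults | undefined {
  const id = modelId.toLowerCase();
  if (id.indexOf("gemini-3") !== -1) {
    // 'high' is the default level for both Gemini 3 Pro and Flash.
    return { family: "gemini-3", thinkingLevel: "high" };
  }
  if (id.indexOf("claude") !== -1 && id.indexOf("thinking") !== -1) {
    // budget_tokens has no API default, so a value must always be supplied.
    return { family: "claude", thinkingBudget: CLAUDE_DEFAULT_THINKING_BUDGET };
  }
  return undefined; // not a thinking model: send no thinking config
}
```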
**Additional changes per maintainer feedback:**
- Simplify Claude variants to 'low' and 'max' only (medium/high not much different)
- Update all README examples and config snippets
- Add 'minimal' to GEMINI_3_THINKING_LEVELS constant
- Change Pro default from 'medium' to 'high' (per Google API docs)
- Document model-specific level availability:
  - Flash: minimal, low, medium, high
  - Pro: low, high only
- Update README variant examples with correct levels per model
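The model-specific availability can be expressed as a small lookup. The constant name `GEMINI_3_THINKING_LEVELS` is from the commit; `LEVELS_BY_TIER` and `isLevelSupported` are hypothetical helpers added for this sketch.

```typescript
const GEMINI_3_THINKING_LEVELS = ["minimal", "low", "medium", "high"] as const;
type Gemini3ThinkingLevel = (typeof GEMINI_3_THINKING_LEVELS)[number];
type Gemini3Tier = "flash" | "pro";

// Flash accepts every level; Pro accepts only low and high.
const LEVELS_BY_TIER: Record<Gemini3Tier, readonly Gemini3ThinkingLevel[]> = {
  flash: GEMINI_3_THINKING_LEVELS,
  pro: ["low", "high"],
};

function isLevelSupported(tier: Gemini3Tier, level: string): boolean {
  const allowed: readonly string[] = LEVELS_BY_TIER[tier];
  return allowed.indexOf(level) !== -1;
}
```

A caller could use `isLevelSupported("pro", "medium")` to reject or remap a variant before the request is built.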
Addresses CodeRabbit review feedback on #131
Fixes #130
## Changes
### Model resolver (already in PR)
- Default thinkingLevel for base Gemini 3 models: Pro → 'medium', Flash → 'minimal'
### Native thinkingLevel support
- extractVariantThinkingConfig now extracts thinkingLevel string for Gemini 3
- Prefer native thinkingLevel when present
- Fall back to budget→level conversion with deprecation warning
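A rough sketch of the preference order follows. The helper name, the `VariantOptions` shape, and especially the budget cut-offs in `budgetToLevel` are assumptions invented for illustration; the commit does not state the real conversion table.

```typescript
type ThinkingLevel = "minimal" | "low" | "medium" | "high";

// Hypothetical variant shape: native level, or a legacy numeric budget.
interface VariantOptions {
  thinkingLevel?: string;
  thinkingConfig?: { thinkingBudget?: number };
}

// Assumed cut-offs, for illustration only.
function budgetToLevel(budget: number): ThinkingLevel {
  if (budget <= 8192) return "low";
  if (budget <= 16384) return "medium";
  return "high";
}

function extractGemini3ThinkingLevel(
  variant: VariantOptions,
  warn: (message: string) => void = (message) => console.warn(message)
): string | undefined {
  // 1. Prefer the native thinkingLevel string when present.
  if (typeof variant.thinkingLevel === "string") {
    return variant.thinkingLevel;
  }
  // 2. Fall back to converting a legacy budget, with a deprecation warning.
  const budget = variant.thinkingConfig?.thinkingBudget;
  if (typeof budget === "number") {
    warn("thinkingBudget is deprecated for Gemini 3; use thinkingLevel instead");
    return budgetToLevel(budget);
  }
  // 3. Nothing configured: let the model resolver apply its default.
  return undefined;
}
```

Passing `warn` as a parameter keeps the function easy to test and lets the caller route the deprecation notice to its own logger.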
### Correct variant config format (README)
- Remove providerOptions.google wrapper from all examples
- Gemini 3: use thinkingLevel string
- Claude: use thinkingConfig.thinkingBudget number
### Code cleanup
- Remove dead Anthropic/OpenRouter checks (all Antigravity routes through Google)
- Add deprecation warning for legacy thinkingBudget on Gemini 3
## Backward Compatibility
- Legacy thinkingBudget for Gemini 3 still works (deprecated)
- Tier-suffixed model names still work