feat: isolate new-tab agent navigation from origin tab (#593)

* feat: isolate new-tab agent navigation from origin tab

  Add origin-aware navigation isolation so the agent never navigates away
  from the new-tab chat UI. This is a two-layer defense:

  1. Prompt adaptation: When origin is 'newtab', the system prompt's
     execution and tool-selection sections are rewritten to prohibit
     navigating the active tab and default all lookups to new_page.
  2. Tool-level guards: navigate_page and close_page reject attempts to
     act on the origin tab when in newtab mode, returning an error that
     teaches the agent to self-correct.

  The client now sends an `origin` field ('sidepanel' | 'newtab') instead
  of injecting a soft NEWTAB_SYSTEM_PROMPT that LLMs could ignore.
  Backwards compatible: defaults to 'sidepanel'.

  Closes TKT-592, addresses TKT-564

* test: add newtab origin navigation guard tests

  - 14 new prompt tests verifying the system prompt adapts correctly for
    newtab vs sidepanel origin (execution rules, tool selection table,
    absence of conflicting single-tab guidance)
  - 6 new integration tests for navigate_page and close_page guards:
    rejects origin tab in newtab mode, allows non-origin tabs, allows all
    tabs in sidepanel mode, backwards compatible with no session
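
The tool-level guard described above can be illustrated in isolation. The following is a minimal sketch, not the project's code: `SessionContext`, `isOriginTabProtected`, and `guardToolCall` are hypothetical names chosen for illustration, and only the three-part condition and the two error strings are taken from the diff below.

```typescript
// Hypothetical standalone sketch of the origin-tab guard.
// The type and function names here are illustrative assumptions.
type Origin = 'sidepanel' | 'newtab'

interface SessionContext {
  origin?: Origin
  originPageId?: number
}

/** True when a call targeting `pageId` would act on the protected origin tab. */
function isOriginTabProtected(
  session: SessionContext | undefined,
  pageId: number,
): boolean {
  return (
    session?.origin === 'newtab' &&
    session.originPageId !== undefined &&
    pageId === session.originPageId
  )
}

/** Returns an error message for a blocked call, or null when the call may proceed. */
function guardToolCall(
  tool: 'navigate_page' | 'close_page',
  session: SessionContext | undefined,
  pageId: number,
): string | null {
  if (!isOriginTabProtected(session, pageId)) return null
  return tool === 'navigate_page'
    ? 'Cannot navigate the origin tab in new-tab mode. Use `new_page` to open a background tab instead.'
    : 'Cannot close the origin tab in new-tab mode.'
}

// Usage: only the origin tab in newtab mode is blocked.
const session: SessionContext = { origin: 'newtab', originPageId: 7 }
console.log(guardToolCall('navigate_page', session, 7) !== null) // blocked
console.log(guardToolCall('navigate_page', session, 8) === null) // allowed
console.log(guardToolCall('close_page', undefined, 7) === null) // no session: allowed
```

Note that the condition only applies when `originPageId` is defined. If the browser context carries no active-tab page ID, the guard does not fire and the prompt rules are the only remaining layer.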
2026-05-13 15:46:22 +00:00 · 2026-03-27 12:06:32 +05:30
parent b3003542d8
commit aacb47f7ee
11 changed files with 499 additions and 24 deletions
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts
@@ -76,8 +76,6 @@ export interface ChatSessionOptions {
  isIntegrationsSynced?: boolean
 }

-const NEWTAB_SYSTEM_PROMPT = `IMPORTANT: The user is chatting from the New Tab page. When performing browser actions, ALWAYS open content in a NEW TAB rather than navigating the current tab. The user's new tab page should remain accessible.`
-
 export const useChatSession = (options?: ChatSessionOptions) => {
  const {
    selectedLlmProviderRef,
@@ -344,12 +342,8 @@ export const useChatSession = (options?: ChatSessionOptions) => {
            reasoningEffort: provider?.reasoningEffort,
            reasoningSummary: provider?.reasoningSummary,
            browserContext,
-            userSystemPrompt:
-              options?.origin === 'newtab'
-                ? [personalizationRef.current, NEWTAB_SYSTEM_PROMPT]
-                    .filter(Boolean)
-                    .join('\n\n')
-                : personalizationRef.current,
+            origin: options?.origin ?? 'sidepanel',
+            userSystemPrompt: personalizationRef.current,
            userWorkingDir: workingDirRef.current,
            supportsImages: provider?.supportsImages,
            previousConversation,
--- a/packages/browseros-agent/apps/server/src/agent/ai-sdk-agent.ts
+++ b/packages/browseros-agent/apps/server/src/agent/ai-sdk-agent.ts
@@ -92,10 +92,15 @@ export class AiSdkAgent {
    }

    // Build browser tools from the unified tool registry
+    const originPageId = config.browserContext?.activeTab?.pageId
    const allBrowserTools = buildBrowserToolSet(
      config.registry,
      config.browser,
      config.resolvedConfig.workingDir,
+      {
+        origin: config.resolvedConfig.origin,
+        originPageId,
+      },
    )
    const browserTools = config.resolvedConfig.chatMode
      ? Object.fromEntries(
@@ -205,6 +210,7 @@ export class AiSdkAgent {
      connectedApps: config.browserContext?.enabledMcpServers,
      declinedApps: config.resolvedConfig.declinedApps,
      skillsCatalog,
+      origin: config.resolvedConfig.origin,
    })

    // Configure compaction for context window management
--- a/packages/browseros-agent/apps/server/src/agent/prompt.ts
+++ b/packages/browseros-agent/apps/server/src/agent/prompt.ts
@@ -193,19 +193,41 @@ You have persistent memory across sessions and an evolving personality. See the

 function getExecution(
  _exclude: Set<string>,
-  _options?: BuildSystemPromptOptions,
+  options?: BuildSystemPromptOptions,
 ): string {
-  return `<execution>
+  const isNewTab = options?.origin === 'newtab'
+
+  let executionContent = `<execution>
 ## Execution

 ### Philosophy
 - Execute tasks end-to-end. Don't delegate ("I found the button, you can click it").
 - Don't ask permission for routine steps. Act, then report.
 - Do not refuse by default, attempt tasks even when outcomes are uncertain.
- For ambiguous/unclear requests, ask one targeted clarifying question.
+- For ambiguous/unclear requests, ask one targeted clarifying question.`
+
+  if (isNewTab) {
+    executionContent += `
+
+### New-Tab Origin Rules
+You are operating from the user's **New Tab page**. The active tab (Page ID from Browser Context) is the chat UI itself.
+
+**CRITICAL RULES:**
+1. **NEVER call \`navigate_page\` on the active tab** — this would destroy the chat UI and navigate the user away.
+2. **NEVER call \`close_page\` on the active tab** — same reason.
+3. For ALL browsing tasks (including single-page lookups), use \`new_page\` (background) to open URLs.
+4. For single-page lookups, open a background tab, extract data, then close it.
+5. For multi-page research, open background tabs and group them with \`group_tabs\`.
+
+### Multi-tab workflow`
+  } else {
+    executionContent += `
 - Stay on the current page for single-page tasks. Use \`navigate_page\` to move within one tab.

-### Multi-tab workflow
+### Multi-tab workflow`
+  }
+
+  executionContent += `
 When a task requires working on multiple pages simultaneously:
 1. **Inform the user** that you're creating background tabs for the task.
 2. **Open new tabs in background** using \`new_page\` (opens in background by default) — never steal focus from the user's current tab.
@@ -216,15 +238,23 @@ When a task requires working on multiple pages simultaneously:
 7. **Never force-switch the user's active tab.** If you need user interaction on a background tab (e.g., login, CAPTCHA), tell the user which tab needs attention and let them switch manually.
 8. **Never navigate the user's current tab** during a multi-tab task. The current tab is the user's anchor — use it only for reading (snapshots, content extraction). All navigation should happen on background tabs.

-**Do NOT use \`create_hidden_window\` or \`new_hidden_page\` for user-requested tasks.** Hidden windows are invisible to the user and cannot be screenshotted. Use \`new_page\` (background mode) instead — tabs appear in the user's tab strip and can be inspected. Reserve hidden windows for automated/scheduled runs only.
+**Do NOT use \`create_hidden_window\` or \`new_hidden_page\` for user-requested tasks.** Hidden windows are invisible to the user and cannot be screenshotted. Use \`new_page\` (background mode) instead — tabs appear in the user's tab strip and can be inspected. Reserve hidden windows for automated/scheduled runs only.`

-For single-page lookups (e.g., "go to X and read Y"), use \`navigate_page\` on the current tab. Only create new tabs when the task requires multiple pages open simultaneously.
+  if (!isNewTab) {
+    executionContent += `
+
+For single-page lookups (e.g., "go to X and read Y"), use \`navigate_page\` on the current tab. Only create new tabs when the task requires multiple pages open simultaneously.`
+  }
+
+  executionContent += `

 ### Tab retry discipline
 When a background tab fails (404, wrong content, unexpected redirect):
 - **Navigate the existing tab** to the correct URL with \`navigate_page\` — do NOT open a new tab for retries.
 - If you must abandon a tab, close it with \`close_page\` before opening a replacement.
- Never let orphan tabs accumulate — each task should end with only the tabs that contain useful content.
+- Never let orphan tabs accumulate — each task should end with only the tabs that contain useful content.`
+
+  executionContent += `

 ### Observe → Act → Verify
 - **Before acting**: Take a snapshot to get interactive element IDs.
@@ -241,13 +271,38 @@ Some tools automatically include a fresh snapshot in their response (labeled "Ad
 - 2FA → notify user, pause for completion
 - Page not found (404) or server error (500) → report the error to the user
 </execution>`
+
+  return executionContent
 }

 // -----------------------------------------------------------------------------
 // section: tool-selection
 // -----------------------------------------------------------------------------

-function getToolSelection(): string {
+function getToolSelection(
+  _exclude: Set<string>,
+  options?: BuildSystemPromptOptions,
+): string {
+  const isNewTab = options?.origin === 'newtab'
+
+  const navTable = isNewTab
+    ? `### Navigation: single-tab vs multi-tab
+| Task | Approach |
+|------|----------|
+| Look up one page | \`new_page\` (background) → extract data → \`close_page\` |
+| Research across multiple sites | \`new_page\` (background) for each site + \`group_tabs\` |
+| Compare two pages side by side | \`new_page\` (background) × 2 + \`group_tabs\` |
+| User says "open a new tab" | \`new_page\` (background) |
+
+**Remember:** The active tab is the New Tab chat UI. Never navigate or close it.`
+    : `### Navigation: single-tab vs multi-tab
+| Task | Approach |
+|------|----------|
+| Look up one page | \`navigate_page\` on current tab |
+| Research across multiple sites | \`new_page\` (background) for each site + \`group_tabs\` |
+| Compare two pages side by side | \`new_page\` (background) × 2 + \`group_tabs\` |
+| User says "open a new tab" | \`new_page\` (background) — don't steal focus |`
+
  return `<tool_selection>
 ## Tool Selection

@@ -268,13 +323,7 @@ function getToolSelection(): string {
 - Prefer \`fill\` over \`press_key\` for text input. Use \`press_key\` for keyboard shortcuts (Enter, Escape, Tab, Ctrl+A, etc.).
 - Prefer clicking links over \`navigate_page\` when the link is visible. Use \`navigate_page\` for direct URL access, back/forward, or reload.

-### Navigation: single-tab vs multi-tab
-| Task | Approach |
-|------|----------|
-| Look up one page | \`navigate_page\` on current tab |
-| Research across multiple sites | \`new_page\` (background) for each site + \`group_tabs\` |
-| Compare two pages side by side | \`new_page\` (background) × 2 + \`group_tabs\` |
-| User says "open a new tab" | \`new_page\` (background) — don't steal focus |
+${navTable}

 ### Connected apps: Strata vs browser
 When an app is Connected, prefer Strata tools over browser automation. Strata is faster, more reliable, and works without navigating away from the user's current page.
@@ -668,7 +717,10 @@ const promptSections: Record<string, PromptSectionFn> = {
  security: getSecurity,
  capabilities: getCapabilities,
  execution: getExecution,
-  'tool-selection': getToolSelection,
+  'tool-selection': (
+    _exclude: Set<string>,
+    options?: BuildSystemPromptOptions,
+  ) => getToolSelection(_exclude, options),
  'external-integrations': getExternalIntegrations,
  'error-recovery': getErrorRecovery,
  'memory-and-identity': getMemoryAndIdentity,
@@ -695,6 +747,8 @@ export interface BuildSystemPromptOptions {
  /** Apps the user previously declined to connect (chose "do it manually"). */
  declinedApps?: string[]
  skillsCatalog?: string
+  /** Where the chat session originates from — determines navigation behavior. */
+  origin?: 'sidepanel' | 'newtab'
 }

 export function buildSystemPrompt(options?: BuildSystemPromptOptions): string {
--- a/packages/browseros-agent/apps/server/src/agent/tool-adapter.ts
+++ b/packages/browseros-agent/apps/server/src/agent/tool-adapter.ts
@@ -39,11 +39,13 @@ export function buildBrowserToolSet(
  registry: ToolRegistry,
  browser: Browser,
  workingDir: string,
+  session?: { origin?: 'sidepanel' | 'newtab'; originPageId?: number },
 ): ToolSet {
  const toolSet: ToolSet = {}
  const ctx: ToolContext = {
    browser,
    directories: { workingDir },
+    session,
  }

  for (const def of registry.all()) {
--- a/packages/browseros-agent/apps/server/src/agent/types.ts
+++ b/packages/browseros-agent/apps/server/src/agent/types.ts
@@ -46,6 +46,8 @@ export interface ResolvedAgentConfig {
  isScheduledTask?: boolean
  /** Apps the user previously declined to connect via MCP (chose "do it manually"). */
  declinedApps?: string[]
+  /** Where the chat session originates from — determines navigation behavior. */
+  origin?: 'sidepanel' | 'newtab'
  /** BrowserOS installation ID for credit-based tracking. */
  browserosId?: string
 }
--- a/packages/browseros-agent/apps/server/src/api/services/chat-service.ts
+++ b/packages/browseros-agent/apps/server/src/api/services/chat-service.ts
@@ -63,6 +63,7 @@ export class ChatService {
      supportsImages: request.supportsImages,
      chatMode: request.mode === 'chat',
      isScheduledTask: request.isScheduledTask,
+      origin: request.origin,
      declinedApps: request.declinedApps,
      browserosId: this.deps.browserosId,
    }
--- a/packages/browseros-agent/apps/server/src/api/types.ts
+++ b/packages/browseros-agent/apps/server/src/api/types.ts
@@ -45,6 +45,7 @@ export const ChatRequestSchema = AgentLLMConfigSchema.extend({
  userWorkingDir: z.string().min(1).optional(),
  supportsImages: z.boolean().optional().default(true),
  mode: z.enum(['chat', 'agent']).optional().default('agent'),
+  origin: z.enum(['sidepanel', 'newtab']).optional().default('sidepanel'),
  declinedApps: z.array(z.string()).optional(),
  selectedText: z.string().optional(),
  selectedTextSource: z
--- a/packages/browseros-agent/apps/server/src/tools/framework.ts
+++ b/packages/browseros-agent/apps/server/src/tools/framework.ts
@@ -22,9 +22,15 @@ export interface ToolDirectories {
  resourcesDir?: string
 }

+export interface ToolSessionContext {
+  origin?: 'sidepanel' | 'newtab'
+  originPageId?: number
+}
+
 export type ToolContext = {
  browser: Browser
  directories: ToolDirectories
+  session?: ToolSessionContext
 }

 export function resolveWorkingPath(
--- a/packages/browseros-agent/apps/server/src/tools/navigation.ts
+++ b/packages/browseros-agent/apps/server/src/tools/navigation.ts
@@ -88,6 +88,17 @@ export const navigate_page = defineTool({
      return
    }

+    if (
+      ctx.session?.origin === 'newtab' &&
+      ctx.session.originPageId !== undefined &&
+      args.page === ctx.session.originPageId
+    ) {
+      response.error(
+        'Cannot navigate the origin tab in new-tab mode — this would destroy the chat UI. Use `new_page` to open a background tab instead.',
+      )
+      return
+    }
+
    switch (args.action) {
      case 'url':
        await ctx.browser.goto(args.page, args.url as string)
@@ -266,6 +277,17 @@ export const close_page = defineTool({
    action: z.literal('close_page'),
  }),
  handler: async (args, ctx, response) => {
+    if (
+      ctx.session?.origin === 'newtab' &&
+      ctx.session.originPageId !== undefined &&
+      args.page === ctx.session.originPageId
+    ) {
+      response.error(
+        'Cannot close the origin tab in new-tab mode — this would destroy the chat UI.',
+      )
+      return
+    }
+
    await ctx.browser.closePage(args.page)
    response.text(`Closed page ${args.page}`)
    response.data({ page: args.page, action: 'close_page' })
--- a/packages/browseros-agent/apps/server/tests/agent/prompt.test.ts
+++ b/packages/browseros-agent/apps/server/tests/agent/prompt.test.ts
@@ -1195,3 +1195,120 @@ describe('nudges', () => {
    expect(prompt).toContain('at most once')
  })
 })
+
+// ---------------------------------------------------------------------------
+// 15. NEW-TAB ORIGIN
+//
+// Why: When the user chats from the new-tab page, the active tab IS the chat
+// UI. The agent must never navigate or close it. The prompt must adapt its
+// execution and tool-selection sections to prohibit origin tab navigation
+// and default all lookups to new_page (background).
+// ---------------------------------------------------------------------------
+
+describe('new-tab origin', () => {
+  /** Build a prompt with newtab origin */
+  function buildNewTab(overrides?: Partial<BuildSystemPromptOptions>): string {
+    return buildSystemPrompt({
+      workspaceDir: '/home/user/workspace',
+      soulContent: 'Be helpful and concise.',
+      origin: 'newtab',
+      ...overrides,
+    })
+  }
+
+  // --- Execution section ---
+
+  it('includes New-Tab Origin Rules when origin is newtab', () => {
+    const prompt = buildNewTab()
+    expect(prompt).toContain('New-Tab Origin Rules')
+    expect(prompt).toContain('New Tab page')
+    expect(prompt).toContain('chat UI itself')
+  })
+
+  it('prohibits navigate_page on active tab in newtab mode', () => {
+    const prompt = buildNewTab()
+    expect(prompt).toContain('NEVER call `navigate_page` on the active tab')
+  })
+
+  it('prohibits close_page on active tab in newtab mode', () => {
+    const prompt = buildNewTab()
+    expect(prompt).toContain('NEVER call `close_page` on the active tab')
+  })
+
+  it('requires new_page for all browsing in newtab mode', () => {
+    const prompt = buildNewTab()
+    expect(prompt).toContain(
+      'For ALL browsing tasks (including single-page lookups), use `new_page`',
+    )
+  })
+
+  it('does NOT include single-tab navigate_page guidance in newtab mode', () => {
+    // The sidepanel prompt says "use navigate_page on the current tab" for
+    // single-page lookups. This must NOT appear in newtab mode.
+    const prompt = buildNewTab()
+    expect(prompt).not.toContain(
+      'For single-page lookups (e.g., "go to X and read Y"), use `navigate_page` on the current tab',
+    )
+  })
+
+  it('does NOT include "Stay on the current page" in newtab mode', () => {
+    const prompt = buildNewTab()
+    expect(prompt).not.toContain(
+      'Stay on the current page for single-page tasks',
+    )
+  })
+
+  it('still includes common execution sections in newtab mode', () => {
+    // Newtab mode should still have multi-tab workflow, observe-act-verify, etc.
+    const prompt = buildNewTab()
+    expect(prompt).toContain('Multi-tab workflow')
+    expect(prompt).toContain('Observe → Act → Verify')
+    expect(prompt).toContain('Tab retry discipline')
+    expect(prompt).toContain('CAPTCHA')
+  })
+
+  // --- Sidepanel (default) should NOT have newtab rules ---
+
+  it('does NOT include New-Tab Origin Rules in sidepanel mode', () => {
+    const prompt = buildRegular({ origin: 'sidepanel' })
+    expect(prompt).not.toContain('New-Tab Origin Rules')
+  })
+
+  it('does NOT include New-Tab Origin Rules when origin is undefined', () => {
+    const prompt = buildRegular()
+    expect(prompt).not.toContain('New-Tab Origin Rules')
+  })
+
+  it('includes single-tab navigate_page guidance in sidepanel mode', () => {
+    const prompt = buildRegular({ origin: 'sidepanel' })
+    expect(prompt).toContain(
+      'For single-page lookups (e.g., "go to X and read Y"), use `navigate_page` on the current tab',
+    )
+  })
+
+  // --- Tool selection section ---
+
+  it('tool selection table uses new_page for lookups in newtab mode', () => {
+    const prompt = buildNewTab()
+    expect(prompt).toContain(
+      '`new_page` (background) → extract data → `close_page`',
+    )
+  })
+
+  it('tool selection includes reminder about active tab in newtab mode', () => {
+    const prompt = buildNewTab()
+    expect(prompt).toContain(
+      'The active tab is the New Tab chat UI. Never navigate or close it.',
+    )
+  })
+
+  it('tool selection table uses navigate_page for lookups in sidepanel mode', () => {
+    const prompt = buildRegular({ origin: 'sidepanel' })
+    expect(prompt).toContain('`navigate_page` on current tab')
+  })
+
+  it('tool selection does NOT have newtab reminder in sidepanel mode', () => {
+    const prompt = buildRegular({ origin: 'sidepanel' })
+    expect(prompt).not.toContain('The active tab is the New Tab chat UI')
+  })
+})
--- a/packages/browseros-agent/apps/server/tests/tools/navigation-newtab-guard.test.ts
+++ b/packages/browseros-agent/apps/server/tests/tools/navigation-newtab-guard.test.ts
@@ -0,0 +1,270 @@
+/**
+ * New-tab origin navigation guards.
+ *
+ * When the chat session originates from the new-tab page, navigate_page and
+ * close_page must reject attempts to act on the origin tab. These are
+ * integration tests that run against a real browser to verify the guards
+ * work end-to-end through executeTool.
+ */
+
+import { describe, it } from 'bun:test'
+import assert from 'node:assert'
+import type { ToolContext, ToolDefinition } from '../../src/tools/framework'
+import { executeTool } from '../../src/tools/framework'
+import { close_page, navigate_page, new_page } from '../../src/tools/navigation'
+import type { ToolResult } from '../../src/tools/response'
+import { withBrowser } from '../__helpers__/with-browser'
+
+function textOf(result: {
+  content: { type: string; text?: string }[]
+}): string {
+  return result.content
+    .filter((c) => c.type === 'text')
+    .map((c) => c.text)
+    .join('\n')
+}
+
+function structuredOf<T>(result: { structuredContent?: unknown }): T {
+  assert.ok(result.structuredContent, 'Expected structuredContent')
+  return result.structuredContent as T
+}
+
+describe('new-tab origin navigation guards', () => {
+  // Helper: execute a tool with newtab session context
+  function executeWithSession(
+    ctx: { browser: ToolContext['browser'] },
+    tool: ToolDefinition,
+    args: unknown,
+    session: ToolContext['session'],
+  ): Promise<ToolResult> {
+    const signal = AbortSignal.timeout(30_000)
+    return executeTool(
+      tool,
+      args,
+      {
+        browser: ctx.browser,
+        directories: { workingDir: process.cwd() },
+        session,
+      },
+      signal,
+    )
+  }
+
+  // -------------------------------------------------------------------------
+  // navigate_page guards
+  // -------------------------------------------------------------------------
+
+  it('navigate_page rejects navigation on origin tab in newtab mode', async () => {
+    await withBrowser(async ({ browser }) => {
+      // Use a new page as the simulated "origin tab"
+      const setupResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const originPageId = structuredOf<{ pageId: number }>(setupResult).pageId
+
+      const result = await executeWithSession(
+        { browser },
+        navigate_page,
+        { page: originPageId, action: 'url', url: 'https://example.com' },
+        { origin: 'newtab', originPageId },
+      )
+
+      assert.ok(result.isError, 'Expected navigate_page to be rejected')
+      assert.ok(
+        textOf(result).includes('Cannot navigate the origin tab'),
+        `Expected origin tab error, got: ${textOf(result)}`,
+      )
+
+      // Cleanup
+      await executeTool(
+        close_page,
+        { page: originPageId },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+    })
+  }, 60_000)
+
+  it('navigate_page allows navigation on non-origin tab in newtab mode', async () => {
+    await withBrowser(async ({ browser }) => {
+      const originResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const originPageId = structuredOf<{ pageId: number }>(originResult).pageId
+
+      // Open a second tab — this is NOT the origin tab
+      const otherResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const otherPageId = structuredOf<{ pageId: number }>(otherResult).pageId
+
+      const result = await executeWithSession(
+        { browser },
+        navigate_page,
+        { page: otherPageId, action: 'url', url: 'https://example.com' },
+        { origin: 'newtab', originPageId },
+      )
+
+      assert.ok(
+        !result.isError,
+        `Expected success, got error: ${textOf(result)}`,
+      )
+      assert.ok(textOf(result).includes('Navigated to'))
+
+      // Cleanup
+      const noSession = { browser, directories: { workingDir: process.cwd() } }
+      await executeTool(
+        close_page,
+        { page: otherPageId },
+        noSession,
+        AbortSignal.timeout(30_000),
+      )
+      await executeTool(
+        close_page,
+        { page: originPageId },
+        noSession,
+        AbortSignal.timeout(30_000),
+      )
+    })
+  }, 60_000)
+
+  it('navigate_page works normally in sidepanel mode', async () => {
+    await withBrowser(async ({ browser }) => {
+      const setupResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const pageId = structuredOf<{ pageId: number }>(setupResult).pageId
+
+      const result = await executeWithSession(
+        { browser },
+        navigate_page,
+        { page: pageId, action: 'url', url: 'https://example.com' },
+        { origin: 'sidepanel', originPageId: pageId },
+      )
+
+      assert.ok(
+        !result.isError,
+        `Expected success, got error: ${textOf(result)}`,
+      )
+      assert.ok(textOf(result).includes('Navigated to'))
+
+      await executeTool(
+        close_page,
+        { page: pageId },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+    })
+  }, 60_000)
+
+  it('navigate_page works when session is undefined (backwards compat)', async () => {
+    await withBrowser(async ({ browser, execute }) => {
+      const setupResult = await execute(new_page, { url: 'about:blank' })
+      const pageId = structuredOf<{ pageId: number }>(setupResult).pageId
+
+      // execute() from withBrowser passes no session — simulates old clients
+      const result = await execute(navigate_page, {
+        page: pageId,
+        action: 'url',
+        url: 'https://example.com',
+      })
+
+      assert.ok(
+        !result.isError,
+        `Expected success, got error: ${textOf(result)}`,
+      )
+
+      await execute(close_page, { page: pageId })
+    })
+  }, 60_000)
+
+  // -------------------------------------------------------------------------
+  // close_page guards
+  // -------------------------------------------------------------------------
+
+  it('close_page rejects closing origin tab in newtab mode', async () => {
+    await withBrowser(async ({ browser }) => {
+      const setupResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const originPageId = structuredOf<{ pageId: number }>(setupResult).pageId
+
+      const result = await executeWithSession(
+        { browser },
+        close_page,
+        { page: originPageId },
+        { origin: 'newtab', originPageId },
+      )
+
+      assert.ok(result.isError, 'Expected close_page to be rejected')
+      assert.ok(
+        textOf(result).includes('Cannot close the origin tab'),
+        `Expected origin tab error, got: ${textOf(result)}`,
+      )
+
+      // Clean up the page we created (without newtab guard)
+      await executeTool(
+        close_page,
+        { page: originPageId },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+    })
+  }, 60_000)
+
+  it('close_page allows closing non-origin tab in newtab mode', async () => {
+    await withBrowser(async ({ browser }) => {
+      const originResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const originPageId = structuredOf<{ pageId: number }>(originResult).pageId
+
+      const otherResult = await executeTool(
+        new_page,
+        { url: 'about:blank' },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+      const otherPageId = structuredOf<{ pageId: number }>(otherResult).pageId
+
+      const result = await executeWithSession(
+        { browser },
+        close_page,
+        { page: otherPageId },
+        { origin: 'newtab', originPageId },
+      )
+
+      assert.ok(
+        !result.isError,
+        `Expected success, got error: ${textOf(result)}`,
+      )
+      assert.ok(textOf(result).includes(`Closed page ${otherPageId}`))
+
+      // Cleanup origin page
+      await executeTool(
+        close_page,
+        { page: originPageId },
+        { browser, directories: { workingDir: process.cwd() } },
+        AbortSignal.timeout(30_000),
+      )
+    })
+  }, 60_000)
+})