Refactor LLM route-first provider API (#28523)

2026-05-21 03:15:11 +00:00 · 2026-05-20 20:15:52 -04:00
parent 5381795844
commit 41f6daf96a
87 changed files with 2450 additions and 1520 deletions
--- a/packages/llm/AGENTS.md
+++ b/packages/llm/AGENTS.md
@@ -10,7 +10,7 @@

 ## Conventions

-Per-type constructors live on the type's namespace, not as top-level re-exports. Use `Message.user(...)`, `Message.assistant(...)`, `Message.tool(...)`, `ToolDefinition.make(...)`, `ToolCallPart.make(...)`, `ToolResultPart.make(...)`, `ToolChoice.make(...)`, `ToolChoice.named(...)`, `SystemPart.make(...)`, and `GenerationOptions.make(...)` directly. The top-level `LLM` namespace is reserved for the request-shaped call API: `LLM.request`, `LLM.generate`, `LLM.stream`, `LLM.model`, `LLM.updateRequest`, `LLM.generateObject`. Two ways to construct the same thing is one too many.
+Per-type constructors live on the type, not as top-level re-exports. Use `Message.user(...)`, `Message.assistant(...)`, `Message.tool(...)`, `Model.make(...)`, `ToolDefinition.make(...)`, `ToolCallPart.make(...)`, `ToolResultPart.make(...)`, `ToolChoice.make(...)`, `ToolChoice.named(...)`, `SystemPart.make(...)`, and `GenerationOptions.make(...)` directly. The top-level `LLM` namespace is reserved for request-shaped call APIs: `LLM.request`, `LLM.generate`, `LLM.stream`, `LLM.updateRequest`, and `LLM.generateObject`. Two ways to construct the same thing is one too many.

 ## Tests

@@ -21,13 +21,22 @@ Per-type constructors live on the type's namespace, not as top-level re-exports.

 This package is an Effect Schema-first LLM core. The Schema classes in `src/schema/` are the canonical runtime data model. Convenience functions in `src/llm.ts` are thin constructors that return those same Schema class instances; they should improve callsites without creating a second model.

+Primary in-repo integration point:
+
+- `packages/opencode/src/session/llm.ts` is the session-owned orchestration layer that decides whether a request uses AI SDK or this package's native route runtime.
+- `packages/opencode/src/session/llm/native-request.ts` is the lowering adapter from opencode's session/AI SDK-shaped data into this package's `LLMRequest` model.
+- `packages/opencode/src/session/llm/native-runtime.ts` is the execution adapter that calls `LLMClient.stream(...)` and bridges opencode tools into this package's tool runtime.
+- `packages/opencode/src/session/llm/ai-sdk.ts` keeps the default AI SDK path compatible by converting AI SDK stream parts into this package's shared `LLMEvent`s.
+
+Keep this package independent of session concerns. Session auth, permissions, plugins, telemetry headers, and runtime selection belong in `packages/opencode/src/session/llm.ts` and its local adapters.
+
 ### Request Flow

 The intended callsite is:

 ```ts
 const request = LLM.request({
-  model: OpenAI.model("gpt-4o-mini", { apiKey }),
+  model: OpenAI.configure({ apiKey }).responses("gpt-4o-mini"),
  system: "You are concise.",
  prompt: "Say hello.",
 })
@@ -35,7 +44,7 @@ const request = LLM.request({
 const response = yield * LLMClient.generate(request)
 ```

-`LLM.request(...)` builds an `LLMRequest`. `LLMClient.generate(...)` selects a registered route by `request.model.route`, builds the provider-native body, asks the route's transport for a real `HttpClientRequest.HttpClientRequest`, sends it through `RequestExecutor.Service`, parses the provider stream into common `LLMEvent`s, and finally returns an `LLMResponse`.
+`LLM.request(...)` builds an `LLMRequest`. `LLMClient.generate(...)` reads the executable route carried by `request.model.route`, builds the provider-native body, asks the route's transport for a real `HttpClientRequest.HttpClientRequest`, sends it through `RequestExecutor.Service`, parses the provider stream into common `LLMEvent`s, and finally returns an `LLMResponse`.

 Use `LLMClient.stream(request)` when callers want incremental `LLMEvent`s. Use `LLMClient.generate(request)` when callers want those same events collected into an `LLMResponse`. Use `LLMClient.prepare<Body>(request)` to compile a request through the route pipeline without sending it — the optional `Body` type argument narrows `.body` to the route's native shape (e.g. `prepare<OpenAIChatBody>(...)` returns a `PreparedRequestOf<OpenAIChatBody>`). The runtime body is identical; the generic is a type-level assertion.

@@ -46,8 +55,8 @@ Filter or narrow `LLMEvent` streams with `LLMEvent.is.*` (camelCase guards, e.g.
 A route is the registered, runnable composition of four orthogonal pieces:

 - **`Protocol`** (`src/route/protocol.ts`) — semantic API contract. Owns request body construction (`body.from`), the body schema (`body.schema`), the streaming-event schema (`stream.event`), and the event-to-`LLMEvent` state machine (`stream.step`). `Route.make(...)` validates and JSON-encodes the body from `body.schema` and decodes frames with `stream.event`. Examples: `OpenAIChat.protocol`, `OpenAIResponses.protocol`, `AnthropicMessages.protocol`, `Gemini.protocol`, `BedrockConverse.protocol`.
-- **`Endpoint`** (`src/route/endpoint.ts`) — path construction. The host always lives on `model.baseURL`; the endpoint just supplies the path. `Endpoint.path("/chat/completions")` is the common case; pass a function for paths that embed the model id or a body field (e.g. `Endpoint.path(({ body }) => `/model/${body.modelId}/converse-stream`)`).
-- **`Auth`** (`src/route/auth.ts`) — per-request transport authentication. Routes read `model.apiKey` at request time via `Auth.bearer` (the default; sets `Authorization: Bearer <apiKey>`) or `Auth.apiKeyHeader(name)` for providers that use a custom header (Anthropic `x-api-key`, Gemini `x-goog-api-key`). Routes that need per-request signing (Bedrock SigV4, future Vertex IAM, Azure AAD) implement `Auth` as a function that signs the body and merges signed headers into the result.
+- **`Endpoint`** (`src/route/endpoint.ts`) — URL construction. The host, path, and route query live on the endpoint. `Endpoint.path("/chat/completions", { baseURL })` is the common case; pass a function for paths that embed the model id or a body field (e.g. `Endpoint.path(({ body }) => `/model/${body.modelId}/converse-stream`)`).
+- **`Auth`** (`src/route/auth.ts`) — per-request transport authentication. Provider facades configure credentials onto the route before model selection, usually via `Auth.bearer(apiKey)` or `Auth.header(name, apiKey)`. Routes that need per-request signing (Bedrock SigV4, future Vertex IAM, Azure AAD) implement `Auth` as a function that signs the body and merges signed headers into the result.
 - **`Framing`** (`src/route/framing.ts`) — bytes → frames. SSE (`Framing.sse`) is shared; Bedrock keeps its AWS event-stream framing as a typed `Framing<object>` value alongside its protocol.

 Compose them via `Route.make(...)`:
@@ -57,55 +66,52 @@ export const route = Route.make({
  id: "openai-chat",
  provider: "openai",
  protocol: OpenAIChat.protocol,
-  transport: HttpTransport.httpJson({
-    endpoint: Endpoint.path("/chat/completions"),
-    auth: Auth.bearer(),
-    framing: Framing.sse,
-    encodeBody,
-  }),
-  defaults: {
+  endpoint: Endpoint.path("/chat/completions", {
    baseURL: "https://api.openai.com/v1",
-    capabilities: capabilities({ tools: { calls: true, streamingInput: true } }),
-  },
+  }),
+  auth: Auth.bearer(),
+  framing: Framing.sse,
 })
 ```

+Route defaults are request-shaping defaults such as `headers`, `limits`, `generation`, `providerOptions`, and `http`. Endpoint host/query belongs on the route endpoint. Selected `Model` values carry only model id, provider id, and the configured route value. Model capability/catalog metadata lives outside this package; protocol support is enforced by request lowering and typed `LLMError`s.
+
 The four-axis decomposition is the reason DeepSeek, TogetherAI, Cerebras, Baseten, Fireworks, and DeepInfra all reuse `OpenAIChat.protocol` verbatim — each provider deployment is a 5-15 line `Route.make(...)` call instead of a 300-400 line route clone. Bug fixes in one protocol propagate to every consumer of that protocol in a single commit.

-When a provider ships a non-HTTP transport (OpenAI's WebSocket Responses backend, hypothetical bidirectional streaming APIs), the seam is `Transport` — `WebSocketTransport.json(...)` constructs a transport whose `prepare` builds a WebSocket URL and message and whose `frames` yields decoded text from the socket. Same protocol, different transport.
+When a provider ships a non-HTTP transport (OpenAI's WebSocket Responses backend, hypothetical bidirectional streaming APIs), the seam is `Transport`. `WebSocketTransport.jsonTransport.with(...)` constructs an IO template: its `prepare` receives the route endpoint/auth at compile time and builds a WebSocket URL and message, and its `frames` yields decoded text from the socket. Same protocol and endpoint source, different transport.

 ### URL Construction

-`model.baseURL` is required; `Endpoint` only carries the path. Each protocol's `Route.make` includes a canonical URL in `defaults.baseURL` (e.g. `https://api.openai.com/v1`); provider helpers can override by passing `baseURL` in their input. Routes that have no canonical URL (OpenAI-compatible Chat, GitHub Copilot) set `baseURL: string` (required) on their input type so TypeScript catches a missing host at the call site.
+`Endpoint` owns `{ baseURL, path, query }`. Each protocol route includes a canonical endpoint when the provider has one (e.g. `https://api.openai.com/v1`); provider helpers override endpoint fields by configuring the route before selecting a model. Routes that have no canonical URL (OpenAI-compatible Chat, GitHub Copilot) require configuration before execution.

-For providers where the URL is derived from typed inputs (Azure resource name, Bedrock region), the provider helper computes `baseURL` at model construction time. Use `AtLeastOne<T>` from `route/auth-options.ts` for inputs that accept either of two derivation paths (Azure: `resourceName` or `baseURL`).
+For providers where the URL is derived from typed inputs (Azure resource name, Bedrock region), the provider helper configures the route endpoint before calling `.model(...)`. Use `AtLeastOne<T>` from `route/auth-options.ts` for inputs that accept either of two derivation paths (Azure: `resourceName` or `baseURL`).

-### Provider Definitions
+### Provider Facades

-Provider-facing APIs are defined with `Provider.make(...)` from `src/provider.ts`:
+Provider-facing APIs are configured facades over route values. Endpoint/auth/resource/API-version setup happens before model selection, and model selectors accept only a model or deployment id:

 ```ts
-export const provider = Provider.make({
-  id: ProviderID.make("openai"),
-  model: responses,
-  apis: { responses, chat },
-})
+const openai = OpenAI.configure({ apiKey, baseURL })
+const model = openai.responses("gpt-4o-mini")

-export const model = provider.model
-export const apis = provider.apis
+const azure = Azure.configure({ resourceName, apiKey, apiVersion: "v1" })
+const deployment = azure.responses("my-deployment")
+
+const gateway = CloudflareAIGateway.configure({ accountId, gatewayId, gatewayApiKey, apiKey })
+const proxied = gateway.model("openai/gpt-4o-mini")
 ```

-Keep provider definitions small and explicit:
+Keep provider facades small and explicit:

-- Use only `id`, `model`, and optional `apis` in `Provider.make(...)`.
 - Use branded `ProviderID.make(...)` and `ModelID.make(...)` where ids are constructed directly.
-- Use `model` for the default API path and `apis` for named provider-native alternatives such as OpenAI `responses` versus `chat`.
-- Do not add author-facing `kind`, `version`, or `routes` fields.
+- Use `model` for the default API path and named methods for provider-native alternatives such as OpenAI `responses`, `responsesWebSocket`, and `chat`.
+- Put provider-specific setup on `.configure(...)`; do not add `model(id, overrides)` as a duplicate construction path.
 - Export lower-level `routes` arrays separately only when advanced internal wiring needs them.
 - Prefer `apiKey` as provider-specific sugar and `auth` as the explicit override; keep them mutually exclusive in provider option types with `ProviderAuthOption`.
 - Resolve `apiKey` → `Auth` with `AuthOptions.bearer(options, "<PROVIDER>_API_KEY")` (it honors an explicit `auth` override and falls back to `Auth.config(envVar)` so missing keys surface a typed `Authentication` error rather than a runtime crash).
+- Use separate top-level facades for products with different required setup, such as `CloudflareAIGateway` and `CloudflareWorkersAI`.

-Built-in providers are namespace modules from `src/providers/index.ts`, so aliases like `OpenAI.model(...)`, `OpenAI.responses(...)`, and `OpenAI.apis.chat(...)` are fine. External provider packages should default-export the `Provider.make(...)` result and may add named aliases if useful.
+`Provider.make(...)` remains available for simple static provider definitions, but new built-in providers should prefer plain configured facades unless a helper removes real duplication without adding runtime behavior.

 ### Folder layout

@@ -113,7 +119,7 @@ Built-in providers are namespace modules from `src/providers/index.ts`, so alias
 packages/llm/src/
  schema/                   canonical Schema model, split by concern
    ids.ts                  branded IDs, literal types, ProviderMetadata
-    options.ts              Generation/Provider/Http options, Capabilities, Limits, ModelRef
+    options.ts              Generation/Provider/Http options, Limits, Model, cache policy
    messages.ts             content parts, Message, ToolDefinition, LLMRequest
    events.ts               Usage, individual events, LLMEvent, PreparedRequest, LLMResponse
    errors.ts               error reasons, LLMError, ToolFailure
@@ -145,12 +151,12 @@ packages/llm/src/
  providers/
    openai-compatible.ts    generic compatible helper + family model helpers
    openai-compatible-profile.ts family defaults (deepseek, togetherai, ...)
-    azure.ts / amazon-bedrock.ts / github-copilot.ts / google.ts / xai.ts / openai.ts / anthropic.ts / openrouter.ts
+    azure.ts / amazon-bedrock.ts / cloudflare.ts / github-copilot.ts / google.ts / xai.ts / openai.ts / anthropic.ts / openrouter.ts
  tool.ts                   typed tool() helper
  tool-runtime.ts           implementation helpers for LLMClient tool execution
 ```

-The dependency arrow points down: `providers/*.ts` files import `protocols`, `endpoint`, `auth`, and `framing`; protocols do not import provider metadata. Lower-level modules know nothing about specific providers.
+The dependency arrow points down: `providers/*.ts` files import protocol routes and auth-option utilities; protocol modules import `endpoint`, `auth`, `framing`, and transport pieces. Protocols do not import provider facades. Lower-level modules know nothing about provider catalog metadata.

 ### Shared protocol helpers

@@ -245,14 +251,14 @@ Use this order for every protocol module:
 5. Request body construction (`fromRequest`)
 6. Stream parsing (`step` and per-event handlers)
 7. Protocol and route
-8. Model helper
+8. Protocol route export

 ### Rules

 - Keep protocol files focused on the protocol. Move provider-specific projection, signing, media normalization, or other bulky transformations into `src/protocols/utils/*`.
 - Use `Effect.fn("Provider.fromRequest")` for request body construction entrypoints. Use `Effect.fn(...)` for event handlers that yield effects; keep purely synchronous handlers as plain functions returning a `StepResult` that the dispatcher lifts via `Effect.succeed(...)`.
-- Parser state owns terminal information. The state machine records finish reason, usage, and pending tool calls; emit one terminal `request-finish` (or `provider-error`) when a `terminal` event arrives. If a provider splits reason and usage across events, merge them in parser state before flushing.
-- Emit exactly one terminal `request-finish` event for a completed response. Use `stream.terminal` to signal the run is over and have `step` emit the final event.
+- Parser state owns terminal information. The state machine records finish reason, usage, and pending tool calls; emit one terminal `finish` event (or `provider-error`) for each completed response. If a provider splits reason and usage across events, merge them in parser state before flushing.
+- Emit exactly one terminal `finish` event for a completed response, normally after a matching `step-finish`. Use `stream.terminal` to stop reading when the provider has a completion sentinel; use `stream.onHalt` when the final event must be flushed after the framed stream ends.
 - Use shared helpers for repeated protocol policy such as text joining, usage totals, JSON parsing, and tool-call accumulation. `ToolStream` (`protocols/utils/tool-stream.ts`) accumulates streamed tool-call arguments uniformly.
 - Make intentional provider differences explicit in helper names or comments. If two protocol files differ visually, the reason should be obvious from the names.
 - Prefer dispatched per-event handlers (`onMessageStart`, `onContentBlockDelta`, ...) called from a small top-level `step` switch over a long if-chain. The dispatcher keeps the event surface visible at a glance.
--- a/packages/llm/README.md
+++ b/packages/llm/README.md
@@ -7,7 +7,7 @@ import { Effect } from "effect"
 import { LLM, LLMClient } from "@opencode-ai/llm"
 import { OpenAI } from "@opencode-ai/llm/providers"

-const model = OpenAI.model("gpt-4o-mini", { apiKey: process.env.OPENAI_API_KEY })
+const model = OpenAI.configure({ apiKey: process.env.OPENAI_API_KEY }).responses("gpt-4o-mini")

 const request = LLM.request({
  model,
@@ -28,10 +28,10 @@ Run `LLMClient.stream(request)` instead of `generate` when you want incremental

 - **`LLM.request({...})`** — build a provider-neutral `LLMRequest`. Accepts ergonomic inputs (`system: string`, `prompt: string`) that normalize into the canonical Schema classes.
 - **`LLM.generate` / `LLM.stream`** — re-exported from `LLMClient` for one-import use.
-- **`LLM.user(...)` / `LLM.assistant(...)` / `LLM.toolMessage(...)`** — message constructors.
-- **`LLM.toolCall(...)` / `LLM.toolResult(...)` / `LLM.toolDefinition(...)`** — tool-related parts.
+- **`Message.user(...)` / `Message.assistant(...)` / `Message.tool(...)`** — message constructors from the canonical schema model.
+- **`Model.make(...)` / `ToolCallPart.make(...)` / `ToolResultPart.make(...)` / `ToolDefinition.make(...)`** — model and tool-related constructors from the canonical schema model.
 - **`LLMClient.prepare(request)`** — compile a request through protocol body construction, validation, and HTTP preparation without sending. Useful for inspection and testing.
-- **`LLMEvent.is.*`** — typed guards (`is.text`, `is.toolCall`, `is.requestFinish`, …) for filtering streams.
+- **`LLMEvent.is.*`** — typed guards (`is.textDelta`, `is.toolCall`, `is.finish`, …) for filtering streams.

 ## Caching

@@ -92,17 +92,19 @@ Normalized cache usage is read back into `response.usage.cacheReadInputTokens` a

 ## Providers

-Each provider exports a `model(...)` helper that records identity, protocol, capabilities, auth, and defaults.
+Provider facades configure endpoint/auth/deployment details first, then expose model selectors that take only a model or deployment id. The selected model carries the executable route value used at runtime.

 ```ts
-import { Anthropic } from "@opencode-ai/llm/providers"
+import { OpenAI, CloudflareAIGateway } from "@opencode-ai/llm/providers"

-const model = Anthropic.model("claude-sonnet-4-6", {
-  apiKey: process.env.ANTHROPIC_API_KEY,
-})
+const openai = OpenAI.configure({ apiKey: process.env.OPENAI_API_KEY }).responses("gpt-4o-mini")
+const gateway = CloudflareAIGateway.configure({
+  accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
+  gatewayApiKey: process.env.CLOUDFLARE_API_TOKEN,
+}).model("workers-ai/@cf/meta/llama-3.1-8b-instruct")
 ```

-Included providers: OpenAI, Anthropic, Google (Gemini), Amazon Bedrock, Azure OpenAI, Cloudflare, GitHub Copilot, OpenRouter, xAI, plus generic OpenAI-compatible helpers for DeepSeek, Cerebras, Groq, Fireworks, Together, etc.
+Included providers: OpenAI, Anthropic, Google (Gemini), Amazon Bedrock, Azure OpenAI, Cloudflare AI Gateway, Cloudflare Workers AI, GitHub Copilot, OpenRouter, xAI, plus generic OpenAI-compatible helpers for DeepSeek, Cerebras, Groq, Fireworks, Together, etc.

 ## Provider options & HTTP overlays

@@ -112,15 +114,15 @@ Three escape hatches in order of stability:
 2. **`providerOptions: { <provider>: {...} }`** — typed-at-the-facade provider-specific knobs (OpenAI `promptCacheKey`, Anthropic `thinking`, Gemini `thinkingConfig`, OpenRouter routing).
 3. **`http: { body, headers, query }`** — last-resort serializable overlays merged into the final HTTP request. Reach for this only when a stable typed path doesn't yet exist.

-Model-level defaults are overridden by request-level values for each axis.
+Route/provider defaults are overridden by request-level values for each axis.

 ## Routes

-Adding a new model or deployment is usually 5–15 lines using `Route.make({ protocol, transport, ... })`. The four orthogonal pieces are protocol (body construction + stream parsing), transport (endpoint + auth + framing + encoding), defaults, and capabilities. See `AGENTS.md` for the architectural detail.
+Adding a new model or deployment is usually 5-15 lines using `Route.make({ protocol, endpoint, auth, framing, ... })`. The route owns endpoint/auth/framing and the protocol owns body construction plus stream parsing. Transports are reusable IO templates that receive route endpoint/auth at compile time. Capability/catalog metadata lives outside this low-level package; unsupported request shapes fail during protocol lowering. See `AGENTS.md` for the architectural detail.

 ## Effect

-This package is built on Effect. Public methods return `Effect` or `Stream`; provide `LLMClient.layer` (the default registers every shipped route) for runtime dispatch. The example at `example/tutorial.ts` is a runnable walkthrough.
+This package is built on Effect. Public methods return `Effect` or `Stream`; provide `LLMClient.layer` for runtime dispatch and import the provider/protocol modules for the routes you use. The example at `example/tutorial.ts` is a runnable walkthrough.

 ## See also

--- a/packages/llm/example/call-sites.md
+++ b/packages/llm/example/call-sites.md
@@ -0,0 +1,591 @@
+# LLM Call Site Sketches
+
+Scratchpad for examples first, abstractions second. Current direction: routes
+execute, provider facades organize configured route sets, and models carry route
+values directly.
+
+## Conversation Summary
+
+Kit and Aidan want provider-specific LLM behavior to move out of opencode's AI
+SDK transform path and into `packages/llm` where possible. The goal is not a big
+generic transform layer; the goal is small composable route definitions backed by
+recorded golden tests.
+
+Things to keep testing against:
+
+- Cache placement: `cache: "auto"`, manual cache breakpoints, provider cache usage.
+- Images: golden image tests for providers/protocols that claim image support.
+- Reasoning: canonical reasoning parts/events versus provider-native knobs.
+- Auth: bearer, custom headers, multiple credentials, query auth, SigV4, OAuth, no auth.
+- OpenAI-compatible providers: DeepSeek, Together, Groq, Alibaba/DashScope, custom routers.
+- Provider switching: stale signatures, encrypted reasoning, provider metadata, incompatible parts.
+- Error quality: typed errors instead of generic SDK/server failures.
+
+## Final Guide: Routes Execute, Providers Organize
+
+Do not introduce a first-class `Deployment` abstraction unless it gains real
+semantics. Provider facades are ergonomic configured route groups, not execution
+registries. The executable/composable thing is still a route. Do not make route
+construction publish to a global registry; models should carry their route value
+directly.
+
+Keep durable identity separate from runtime capability:
+
+- Durable identity is small serializable data like `{ providerID, modelID }` for
+  config, sessions, logs, and catalogs.
+- Runtime capability is a `Model` with a route value, protocol, transport, auth,
+  and defaults. It is allowed to contain functions and schemas.
+- If persisted identity needs to become executable, resolve it through an app
+  boundary first. Do not make `LLMRequest` recover behavior from a global route
+  side table.
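
A minimal sketch of the identity/capability split described above. It uses hypothetical local types (`ModelRef`, `RouteSketch`, `RuntimeModel`, `resolveModel`) rather than this package's real `Model` or `Route`, and the resolver table stands in for whatever the app boundary actually owns:

```typescript
// Hypothetical stand-ins; not the package's real types.
type ModelRef = { readonly providerID: string; readonly modelID: string }

type RouteSketch = {
  readonly id: string
  // Runtime values may hold functions; durable identity may not.
  readonly send: (modelID: string, prompt: string) => string
}

type RuntimeModel = { readonly id: string; readonly provider: string; readonly route: RouteSketch }

// The app boundary owns the mapping from persisted identity to executable models.
const selectors: Record<string, (modelID: string) => RuntimeModel> = {
  openai: (modelID) => ({
    id: modelID,
    provider: "openai",
    route: { id: "openai-responses", send: (m, prompt) => `POST /responses ${m}: ${prompt}` },
  }),
}

const resolveModel = (ref: ModelRef): RuntimeModel => {
  const select = selectors[ref.providerID]
  if (select === undefined) throw new Error(`Unknown provider: ${ref.providerID}`)
  return select(ref.modelID)
}

// Identity survives a JSON round trip; the resolved model carries its route directly.
const persisted: ModelRef = JSON.parse(JSON.stringify({ providerID: "openai", modelID: "gpt-4o" }))
const resolved = resolveModel(persisted)
```

After resolution, nothing downstream needs a lookup: the request holds `resolved`, and execution reads `resolved.route`.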
+
+Keep unconfigured behavior values as values, not factories. A transport like
+`HttpTransport.sseJson` should be a reusable immutable value. Use a function only
+when the caller supplies options or when construction needs fresh state.
+
+Use constants to remove repetition before inventing abstractions. Provider ids
+are branded once per provider facade and reused across routes; a plain exported
+object is enough for the provider-facing API unless a helper earns its keep by
+removing repeated route projection.
+
+Expose default configured provider instances, and put provider-specific setup on
+`.configure(...)`. Model selectors stay pure: `model(id)`, `responses(id)`,
+`chat(id)`, etc. Endpoint/auth/resource/api-version configuration happens before
+model selection, not as a second argument to model selection.
+
+Use provider/product facades consistently:
+
+- One coherent provider/product config surface gets one top-level facade.
+- APIs/model kinds that share that config are methods on the facade.
+- Different products with different required config get separate top-level
+  facades, not a shared namespace with unrelated children.
+- Default facades are exposed only when concrete defaults or lazy env/credential
+  defaults make the facade valid.
+
+Examples:
+
+```ts
+OpenAI.responses("gpt-4o")
+OpenAI.chat("gpt-4o")
+OpenAI.responsesWebSocket("gpt-4o")
+
+Azure.configure({ resourceName, apiKey }).responses("my-deployment")
+AmazonBedrock.configure({ region, credentials }).model("anthropic.claude-3-5-sonnet-20241022-v2:0")
+
+CloudflareAIGateway.configure({ accountId, gatewayId, gatewayApiKey, apiKey }).model("openai/gpt-4o")
+CloudflareWorkersAI.configure({ accountId, apiKey }).model("@cf/meta/llama-3.1-8b-instruct")
+
+OpenAICompatible.configure({
+  provider: "custom",
+  baseURL: "https://custom.example/v1",
+  auth: Auth.bearer(apiKey),
+}).model("custom-model")
+```
+
+Standardize the provider facade contract before abstracting construction. A
+plain object is enough at first; add a helper only if repeated route projection
+starts hiding the real provider-specific config.
+
+`Route.with(...)` patch semantics should be boring and explicit:
+
+- Omitted fields inherit from the original route.
+- `endpoint` patches merge with the existing endpoint, so overriding `baseURL`
+  keeps the existing `path`.
+- `endpoint.query` merges by default; later values win.
+- `auth` replaces.
+- `headers` merge by default; undefined values are omitted.
+- `id` is optional in patches. Route ids are diagnostic/provider API labels, not
+  global runtime registry keys.
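
The patch rules above can be sketched as a pure function. This is an illustration under assumptions, not the package's `Route.with(...)`: the `RouteSketch` and `RoutePatch` shapes are hypothetical, `auth` is reduced to a string label, and "undefined values are omitted" is read as "an undefined patch header is skipped".

```typescript
// Hypothetical shapes for illustration; the real Route type carries more fields.
type EndpointSketch = { baseURL?: string; path?: string; query?: Record<string, string> }
type RouteSketch = {
  id: string
  endpoint: EndpointSketch
  auth: string
  headers: Record<string, string>
}
type RoutePatch = {
  id?: string
  endpoint?: EndpointSketch
  auth?: string
  headers?: Record<string, string | undefined>
}

const patchRoute = (route: RouteSketch, patch: RoutePatch): RouteSketch => {
  // Headers merge; later values win and undefined values are skipped.
  const headers: Record<string, string> = { ...route.headers }
  for (const [name, value] of Object.entries(patch.headers ?? {})) {
    if (value !== undefined) headers[name] = value
  }
  return {
    id: patch.id ?? route.id, // omitted fields inherit
    endpoint: {
      // Endpoint patches merge field by field, so a new baseURL keeps the path.
      baseURL: patch.endpoint?.baseURL ?? route.endpoint.baseURL,
      path: patch.endpoint?.path ?? route.endpoint.path,
      query: { ...route.endpoint.query, ...patch.endpoint?.query },
    },
    auth: patch.auth ?? route.auth, // auth replaces as a whole
    headers,
  }
}

const chat: RouteSketch = {
  id: "openai-chat",
  endpoint: { baseURL: "https://api.openai.com/v1", path: "/chat/completions" },
  auth: "bearer:OPENAI_API_KEY",
  headers: { "x-client": "sketch" },
}

const deepseek = patchRoute(chat, {
  id: "deepseek-chat",
  endpoint: { baseURL: "https://api.deepseek.com/v1" },
  auth: "bearer:DEEPSEEK_API_KEY",
  headers: { "x-org": undefined },
})
```

The original `chat` value is never mutated, which is what lets one protocol route be specialized repeatedly.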
+
+1. **Route**
+   - route id
+   - provider id
+   - protocol
+   - body schema
+   - body builder
+   - stream event schema
+   - parser/state machine
+   - transport
+     - method / IO shape
+     - framing
+     - request preparation
+     - constants when unconfigured; functions only when configured
+   - endpoint
+     - base URL
+     - static path
+     - body/model-derived path
+     - query params
+   - auth
+     - bearer
+     - custom header
+     - multiple credentials
+     - SigV4
+     - none
+   - defaults
+     - headers
+     - generation defaults
+     - provider options
+     - limits
+2. **Provider Facade**
+   - default configured provider instance
+   - provider-specific `.configure(...)`
+   - plain object/function facade over one or more routes
+   - top-level export only when it represents one coherent config surface
+   - no passive `Provider.make(...)` wrapper unless it gains runtime behavior
+3. **Model Selector**
+   - route/provider-owned selector
+   - accepts model id only
+   - returns executable models
+   - does not accept endpoint/auth/deployment overrides
+4. **Model**
+   - model id
+   - provider id
+   - route value, already configured at selection time
+5. **LLM Request**
+   - model
+   - messages/tools
+   - generation/cache/reasoning/response-format options
+   - request-level HTTP overlays for per-request headers/query/body additions,
+     not provider endpoint/auth reconfiguration
+6. **Compile**
+   - read route from model
+   - merge route defaults and request overrides
+   - build final URL from route endpoint
+   - apply auth from the configured route
+   - build body with protocol
+   - execute with transport and parse with protocol
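
A rough sketch of the first four compile steps, assuming simplified hypothetical shapes (`RouteSketch`, `RequestSketch`, `compileSketch`). Body construction and execution are left out, and applying auth after header merging is an assumption made here, not something the outline states:

```typescript
// Hypothetical shapes; the real Route, Model, and LLMRequest carry more fields.
type EndpointSketch = { baseURL: string; path: string; query?: Record<string, string> }
type RouteSketch = {
  endpoint: EndpointSketch
  defaults: { headers?: Record<string, string> }
  auth: (headers: Record<string, string>) => Record<string, string>
}
type RequestSketch = {
  model: { id: string; route: RouteSketch }
  http?: { headers?: Record<string, string>; query?: Record<string, string> }
}

const compileSketch = (request: RequestSketch) => {
  // 1. Read the route from the model; there is no registry lookup.
  const route = request.model.route
  // 2. Merge route defaults and request overlays; request values win.
  const query = { ...route.endpoint.query, ...request.http?.query }
  const merged = { ...route.defaults.headers, ...request.http?.headers }
  // 3. Build the final URL from the route endpoint.
  const search = Object.entries(query)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&")
  const base = route.endpoint.baseURL.replace(/\/+$/, "")
  const url = base + route.endpoint.path + (search ? `?${search}` : "")
  // 4. Apply auth from the configured route (assumed to run last).
  return { url, headers: route.auth(merged) }
}

const compiled = compileSketch({
  model: {
    id: "my-deployment",
    route: {
      endpoint: { baseURL: "https://example.test/openai/", path: "/responses", query: { "api-version": "v1" } },
      defaults: { headers: { "x-default": "route", "x-trace": "route" } },
      auth: (headers) => ({ ...headers, authorization: "Bearer test-key" }),
    },
  },
  http: { headers: { "x-trace": "request" } },
})
```

The request overlay can add or override headers and query values, but the host and credentials come only from the route the model carries.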
+
+## Provider Facade Shape
+
+The provider abstraction is a facade over configured routes, not the runtime
+execution mechanism:
+
+```ts
+type ProviderFacade<APIs, Config> = {
+  readonly id: ProviderID
+  readonly model: (id: string) => Model
+  readonly configure: (input?: Config) => ProviderFacade<APIs, Config>
+} & APIs
+```
+
+Manual construction is fine and should be the default until duplication earns a
+helper:
+
+```ts
+export const OpenAI = {
+  id: openAIProvider,
+  model: openAIResponses.model,
+  responses: openAIResponses.model,
+  chat: openAIChat.model,
+  configure: configureOpenAI,
+} satisfies ProviderFacade<
+  {
+    responses: (id: string) => Model
+    chat: (id: string) => Model
+  },
+  OpenAIConfig
+>
+```
+
+If several providers repeat the same projection from route values to model
+methods, the helper can stay deliberately tiny:
+
+```ts
+const configureOpenAI = (input: OpenAIConfig = {}) =>
+  Provider.define({
+    id: openAIProvider,
+    routes: {
+      responses: openAIResponses.with(openAIConfig(input)),
+      chat: openAIChat.with(openAIConfig(input)),
+    },
+    default: "responses",
+    configure: configureOpenAI,
+  })
+
+export const OpenAI = configureOpenAI()
+```
+
+`Provider.define(...)` would only project route methods and preserve types:
+
+```ts
+OpenAI.model("gpt-4o")
+OpenAI.responses("gpt-4o")
+OpenAI.chat("gpt-4o")
+OpenAI.configure({ apiKey }).responses("gpt-4o")
+```
+
+It must not register routes, select routes dynamically, or participate in
+execution. Execution still reads the route value carried by the model.
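
To make "only project route methods" concrete, here is one possible shape for such a helper. Everything in it is hypothetical (`define`, `makeRoute`, `ModelSketch`, `RouteSketch`); it is not the package's `Provider` module, and it omits `configure` for brevity:

```typescript
// Hypothetical stand-ins for the package's Model and Route values.
type ModelSketch = { readonly id: string; readonly routeID: string }
type RouteSketch = { readonly id: string; readonly model: (id: string) => ModelSketch }
type Selector = (id: string) => ModelSketch
type Selectors<Routes> = { readonly [K in keyof Routes]: Selector }

const makeRoute = (id: string): RouteSketch => ({
  id,
  model: (modelID) => ({ id: modelID, routeID: id }),
})

// Projects each named route to its model selector. No registry, no dispatch.
const define = <Routes extends Record<string, RouteSketch>>(input: {
  readonly id: string
  readonly routes: Routes
  readonly default: keyof Routes
}) => {
  const selectors: Record<string, Selector> = {}
  for (const [name, route] of Object.entries(input.routes)) selectors[name] = route.model
  return {
    id: input.id,
    model: input.routes[input.default].model,
    // The cast only restores the per-route method names for callers.
    ...(selectors as unknown as Selectors<Routes>),
  }
}

const OpenAISketch = define({
  id: "openai",
  routes: { responses: makeRoute("openai-responses"), chat: makeRoute("openai-chat") },
  default: "responses",
})
```

The returned object holds plain references to each route's own `model` function, so the helper has no part in execution once a model is selected.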
+
+## Ideal Call Sites
+
+Define concrete routes for a native provider, then project them through a
+provider facade:
+
+```ts
+const openAIProvider = ProviderID.make("openai")
+
+const openAIResponses = Route.make({
+  id: "openai-responses",
+  provider: openAIProvider,
+  protocol: OpenAIResponses.protocol,
+  transport: HttpTransport.sseJson,
+  endpoint: {
+    baseURL: "https://api.openai.com/v1",
+    path: "/responses",
+  },
+  auth: Auth.envBearer("OPENAI_API_KEY"),
+})
+
+const openAIChat = Route.make({
+  id: "openai-chat",
+  provider: openAIProvider,
+  protocol: OpenAIChat.protocol,
+  transport: HttpTransport.sseJson,
+  endpoint: {
+    baseURL: "https://api.openai.com/v1",
+    path: "/chat/completions",
+  },
+  auth: Auth.envBearer("OPENAI_API_KEY"),
+})
+
+const openAIResponsesWebSocket = openAIResponses.with({
+  id: "openai-responses-websocket",
+  transport: WebSocketTransport.json,
+})
+
+const openAIConfig = (input: OpenAIConfig) => ({
+  endpoint: input.endpoint,
+  auth: input.auth ?? (input.apiKey ? Auth.bearer(input.apiKey) : undefined),
+  headers: {
+    "OpenAI-Organization": input.organization,
+    "OpenAI-Project": input.project,
+  },
+})
+
+const configureOpenAI = (input: OpenAIConfig = {}) => {
+  const responses = openAIResponses.with(openAIConfig(input))
+  const responsesWebSocket = openAIResponsesWebSocket.with(openAIConfig(input))
+  const chat = openAIChat.with(openAIConfig(input))
+
+  return {
+    id: openAIProvider,
+    responses: responses.model,
+    responsesWebSocket: responsesWebSocket.model,
+    chat: chat.model,
+    model: responses.model,
+    configure: configureOpenAI,
+  }
+}
+
+export const OpenAI = configureOpenAI()
+```
+
+Specialize an existing route functionally for concrete providers:
+
+```ts
+const deepSeekProvider = ProviderID.make("deepseek")
+
+const deepseekChat = openAIChat.with({
+  id: "deepseek-chat",
+  provider: deepSeekProvider,
+  endpoint: {
+    baseURL: "https://api.deepseek.com/v1",
+  },
+  auth: Auth.envBearer("DEEPSEEK_API_KEY"),
+})
+
+const configureDeepSeek = (input: OpenAICompatibleConfig = {}) => {
+  const route = deepseekChat.with({
+    endpoint: input.endpoint,
+    auth: input.auth ?? (input.apiKey ? Auth.bearer(input.apiKey) : undefined),
+  })
+
+  return {
+    id: deepSeekProvider,
+    model: route.model,
+    configure: configureDeepSeek,
+  }
+}
+
+export const DeepSeek = {
+  id: deepSeekProvider,
+  model: deepseekChat.model,
+  configure: configureDeepSeek,
+}
+```
+
+Provider-specific configuration happens before model selection:
+
+```ts
+const deepseek = DeepSeek.configure({
+  endpoint: {
+    baseURL: "https://proxy.example.com/v1",
+  },
+  auth: Auth.bearer(apiKey),
+})
+
+const model = deepseek.model("deepseek-chat")
+```
+
+The final request call site stays boring:
+
+```ts
+const response = yield* LLM.generate(
+  LLM.request({
+    model: DeepSeek.model("deepseek-chat"),
+    prompt: "Hello.",
+  }),
+)
+```
+
+HTTP versus WebSocket is represented as named route selectors, not as model or
+request overrides. Same protocol, different transport, different route:
+
+```ts
+OpenAI.responses("gpt-4o")
+OpenAI.responsesWebSocket("gpt-4o")
+```
+
+The client should not require a different public layer just because a selected
+route uses WebSocket. Use one `LLMClient.layer` with HTTP and WebSocket runtime
+capabilities available; routes that do not need WebSocket simply never touch it.
+If a WebSocket route is selected in an environment without WebSocket support,
+fail with a typed transport configuration error.
+
+Azure is a route specialization with auth/path/default changes plus input
+mapping. The public API configures the Azure resource once, then selects
+deployment ids with pure model selectors:
+
+```ts
+const azureProvider = ProviderID.make("azure")
+
+const azureResponses = openAIResponses.with({
+  id: "azure-openai-responses",
+  provider: azureProvider,
+  auth: Auth.envHeader("api-key", "AZURE_OPENAI_API_KEY"),
+})
+
+const configureAzure = (input: AzureConfig = {}) => {
+  const route = azureResponses.with({
+    endpoint: {
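+      // Prefer an explicit base URL, then an explicit resource name, then the lazy env default.
+      baseURL:
+        input.baseURL ??
+        (input.resourceName
+          ? `https://${input.resourceName}.openai.azure.com/openai/v1`
+          : Endpoint.envBaseURL(
+              "AZURE_RESOURCE_NAME",
+              (resourceName) => `https://${resourceName}.openai.azure.com/openai/v1`,
+            )),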
+      query: { "api-version": input.apiVersion ?? "v1" },
+    },
+    auth: input.apiKey ? Auth.header("api-key", input.apiKey) : Auth.envHeader("api-key", "AZURE_OPENAI_API_KEY"),
+  })
+
+  return {
+    id: azureProvider,
+    model: route.model,
+    responses: route.model,
+    configure: configureAzure,
+  }
+}
+
+export const Azure = configureAzure()
+
+const azure = Azure.configure({
+  resourceName: "my-resource",
+  apiVersion: "v1",
+})
+
+const model = azure.responses("my-deployment")
+```
+
+Default provider facades are only valid when required configuration has a lazy
+default source. `Azure.responses("my-deployment")` can be valid if endpoint
+resolution reads `AZURE_RESOURCE_NAME` lazily and fails with a typed
+configuration error when missing. If a provider has no sensible lazy default,
+do not expose a default model selector; expose only a configured entrypoint.
+
+Cloudflare AI Gateway and Workers AI are separate product facades because their
+configuration surfaces differ. Do not make a root `Cloudflare.configure(...)`
+pretend there is one coherent Cloudflare provider configuration:
+
+```ts
+const cloudflareProvider = ProviderID.make("cloudflare-ai-gateway")
+
+const cloudflareOpenAIChat = openAIChat.with({
+  id: "cloudflare-ai-gateway-openai-chat",
+  provider: cloudflareProvider,
+  auth: Auth.bearerHeader("cf-aig-authorization").andThen(Auth.bearer()),
+})
+
+const configureCloudflareAIGateway = (input: CloudflareAIGatewayConfig) => {
+  const route = cloudflareOpenAIChat.with({
+    endpoint: {
+      baseURL: `https://gateway.ai.cloudflare.com/v1/${input.accountId}/${input.gatewayId}/openai`,
+    },
+    auth: Auth.bearerHeader("cf-aig-authorization", input.gatewayApiKey).andThen(Auth.bearer(input.apiKey)),
+  })
+
+  return {
+    id: cloudflareProvider,
+    model: (modelID: string) => route.model({ id: modelID }),
+    configure: configureCloudflareAIGateway,
+  }
+}
+
+export const CloudflareAIGateway = {
+  id: cloudflareProvider,
+  configure: configureCloudflareAIGateway,
+}
+
+const gateway = CloudflareAIGateway.configure({
+  accountId: "account",
+  gatewayId: "gateway",
+  gatewayApiKey,
+  apiKey,
+})
+
+const model = gateway.model("openai/gpt-4o")
+```
+
+If a Cloudflare product gains a full lazy env default, it can expose a direct
+selector too. Until then, omitting `CloudflareAIGateway.model(...)` makes missing
+account/gateway configuration unrepresentable.
+
+opencode's dynamic runtime should construct executable models at its app
+boundary instead of exposing a giant unstructured public model constructor or a
+generic dynamic resolver:
+
+```ts
+const model =
+  providerID === "azure"
+    ? Azure.configure(resolvedAzureConfig).responses(apiModelID)
+    : endpoint.websocket
+      ? OpenAI.responsesWebSocket(apiModelID)
+      : OpenAI.responses(apiModelID)
+```
+
+That boundary can branch on durable config/catalog metadata and call typed
+provider APIs directly. Transport selection belongs there too: map metadata like
+`endpoint.websocket` to `OpenAI.responsesWebSocket(apiModelID)`; otherwise use
+the normal `OpenAI.responses(apiModelID)` route. The client runtime only executes
+the route carried by the model.
+
+## Competitive Shape
+
+This follows the strongest parts of adjacent libraries:
+
+- AI SDK: configured provider instances expose provider-specific model methods.
+- Effect AI: executable models carry provider requirements and can be resolved by
+  an app boundary.
+- LiteLLM/opencode config: dynamic `providerID/modelID` branching belongs at the
+  app boundary, not in the typed public provider API or a global runtime
+  resolver.
+- LangChain/LlamaIndex: constructor-style config plus model id is convenient,
+  but we avoid making model selection also configure endpoint/auth.
+
+The chosen split is:
+
+```txt
+Route = execution mechanics
+Provider facade = configured route group
+Model = selected executable model carrying route value
+App boundary = explicit durable-config -> typed-provider call
+```
+
+## What This Removes
+
+- No `Provider.make(...)`, either as a core abstraction or as a wrapper that only
+  binds an id to model functions. Use a branded provider id constant and a plain
+  exported provider facade.
+- No `Deployment.define(...)` unless future examples force it.
+- No global route registry as the normal execution path.
+- No import side effects required before a model can execute.
+- No duplicate `provider.id` object when selected models already carry provider
+  id.
+- No `model(id, overrides)` escape hatch. Model selection takes the model id;
+  endpoint/auth/deployment customization happens by configuring the route first.
+- No transport override on model/request. HTTP SSE versus WebSocket is a named
+  route selector such as `responses` versus `responsesWebSocket`.
+- No separate public `LLMClient.layerWithWebSocket`. The runtime should expose one
+  client layer with the available transport capabilities.
+- No executable `ModelRef`. The executable handle is `Model`; durable model
+  identity stays separate and cannot execute on its own.
+
+## Implementation Todo
+
+- [x] Replace the current executable `ModelRef` with `Model`.
+- [x] Change `Model.route` to carry a route value, not a `RouteID` string.
+- [ ] Keep a separate durable model identity type for persisted/session/catalog
+      data, likely `{ providerID, modelID }`, and make it clear that it cannot
+      execute without resolver context.
+- [x] Change route model selectors so `route.model(id)` returns an executable
+      model with the route value attached, not a globally registered route id.
+- [x] Remove the standalone `Route.model(route, defaults, mapInput)` helper;
+      configured route instances own model selection.
+- [x] Remove endpoint/auth escape hatches from route model selection; callers must
+      configure endpoint/auth through `route.with(...)` or provider facades before
+      calling `.model(...)`.
+- [x] Remove request-shaping defaults from `Model`; selected models now carry only
+      id, provider, and configured route while defaults live on routes or requests.
+- [x] Rework `LLMClient.prepare` / `stream` / `generate` to read
+      `request.model.route` directly instead of calling `registeredRoute(...)`.
+- [x] Remove `Route.make(...)` global registration from the normal execution
+      path; keep route ids only as diagnostics/provider API labels.
+- [x] Model endpoint as `{ baseURL, path, query }` on routes, then remove the
+      current split where host/query live on the model and path lives in route
+      transport setup.
+- [x] Define `Route.with(...)` with explicit patch semantics for endpoint merge,
+      query merge, header merge, auth replacement, and optional diagnostic id.
+- [x] Make unconfigured transports reusable constants such as
+      `HttpTransport.sseJson`; keep transport functions only for configured/fresh
+      state construction.
+- [x] Collapse the public WebSocket runtime split so one `LLMClient.layer`
+      exposes available transport capabilities and selected routes fail with typed
+      transport config errors when a required capability is missing.
+- [x] Convert OpenAI provider APIs to provider-facade shape:
+      `OpenAI.configure(config).responses(id)`, `.chat(id)`, and
+      `.responsesWebSocket(id)`.
+- [x] Convert Azure to a configured facade where resource/base URL/api version
+      setup happens before selecting deployment ids.
+- [x] Split Cloudflare products into separate facades such as
+      `CloudflareAIGateway` and `CloudflareWorkersAI`; do not expose a shared root
+      config surface unless a single coherent Cloudflare configuration actually exists.
+- [x] Migrate remaining built-in provider facades one at a time so configuration
+      happens before model selection and selectors accept only ids. xAI, GitHub
+      Copilot, OpenRouter, OpenAI-compatible families, Anthropic, Google/Gemini,
+      and Amazon Bedrock now use configured facades such as
+      `Provider.configure(options).model(id)`, with named selectors where needed.
+- [ ] Decide whether a tiny `Provider.define(...)` helper is warranted after two
+      or three provider conversions; start with plain objects if duplication is not
+      yet painful.
+- [x] Update `packages/opencode/src/session/llm/native-request.ts` to construct
+      executable models at the session boundary with explicit provider facade
+      calls, mapping catalog metadata such as `endpoint.websocket` to the correct
+      named route selector.
+- [ ] Update tests so direct route/provider tests assert route values are carried
+      by executable models, and opencode/native tests assert boundary-based route
+      selection.
+- [ ] Remove compatibility exports or stale docs only after internal call sites
+      are migrated; do not keep duplicate constructor paths without an external
+      compatibility need.
+
+## Open Questions
+
+- Default facades with required setup: should providers like Azure and Bedrock
+  expose default model selectors only when all required setup has lazy env or
+  credential-chain defaults? Where no such defaults exist, omit the default
+  selector so missing config is impossible at the type/API level.
+- Lazy endpoint/auth values: should `Endpoint.envBaseURL(...)` and env-backed
+  auth produce typed configuration/authentication errors at compile/prepare time
+  or only when executing the transport?
+- `Route.with(...)` clearing semantics: endpoint/query/header patches merge by
+  default, but what is the explicit way to remove an inherited value?
+- Provider facade helper: keep plain objects until duplication hurts, or add a
+  tiny `Provider.define(...)` immediately to enforce shape and method projection?
+- Auth shape: should auth stay as today's composable `Auth`, or split into an
+  auth placement/strategy and credential sources?
+- Naming: is `baseURL` still the right endpoint field name, or should it be
+  `origin` / `urlPrefix` to clarify that route `path` is appended?
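
Reviewer sketch (not part of the patch): the `Route.with(...)` patch semantics named in the todo list above (endpoint merge, query merge, header merge, auth replacement) might look roughly like the following. Every name here (`RouteSketch`, `RoutePatch`, `withPatch`, `selectModel`) is hypothetical and stands in for the package's real `Route` types, which this diff does not show in full.

```typescript
// Hypothetical shapes for illustration only; not the package's real Route types.
type EndpointSketch = { baseURL?: string; path?: string; query?: Record<string, string> }
type AuthSketch = { kind: "none" } | { kind: "bearer"; token: string }

interface RouteSketch {
  readonly id: string
  readonly provider: string
  readonly endpoint: EndpointSketch
  readonly headers: Record<string, string>
  readonly auth: AuthSketch
}

type RoutePatch = {
  id?: string
  provider?: string
  endpoint?: EndpointSketch
  headers?: Record<string, string | undefined>
  auth?: AuthSketch
}

// Drop undefined values so an unset config field never clobbers an inherited header.
const defined = (record: Record<string, string | undefined> = {}): Record<string, string> => {
  const out: Record<string, string> = {}
  for (const key of Object.keys(record)) {
    const value = record[key]
    if (value !== undefined) out[key] = value
  }
  return out
}

// Endpoint fields, query, and headers merge; auth is replaced wholesale.
// The input route is never mutated, so base routes stay reusable constants.
const withPatch = (route: RouteSketch, patch: RoutePatch): RouteSketch => ({
  id: patch.id ?? route.id,
  provider: patch.provider ?? route.provider,
  endpoint: {
    baseURL: patch.endpoint?.baseURL ?? route.endpoint.baseURL,
    path: patch.endpoint?.path ?? route.endpoint.path,
    query: { ...route.endpoint.query, ...patch.endpoint?.query },
  },
  headers: { ...route.headers, ...defined(patch.headers) },
  auth: patch.auth ?? route.auth,
})

interface ModelSketch {
  readonly id: string
  readonly provider: string
  readonly route: RouteSketch
}

// The selected model carries the route value itself, not a registry key.
const selectModel = (route: RouteSketch) => (id: string): ModelSketch => ({
  id,
  provider: route.provider,
  route,
})

const openAIChatSketch: RouteSketch = {
  id: "openai-chat",
  provider: "openai",
  endpoint: { baseURL: "https://api.openai.com/v1", path: "/chat/completions" },
  headers: {},
  auth: { kind: "none" },
}

const deepseekChatSketch = withPatch(openAIChatSketch, {
  id: "deepseek-chat",
  provider: "deepseek",
  endpoint: { baseURL: "https://api.deepseek.com/v1" },
  auth: { kind: "bearer", token: "example-token" },
})

const deepseekModel = selectModel(deepseekChatSketch)("deepseek-chat")
```

In this sketch the DeepSeek route inherits `/chat/completions` from the base route and overrides only the host and auth, which is the behavior the DeepSeek example in the design doc relies on. How an inherited value would be cleared is left open, as in the Open Questions list.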
--- a/packages/llm/example/tutorial.ts
+++ b/packages/llm/example/tutorial.ts
@@ -1,6 +1,6 @@
 import { Config, Effect, Formatter, Layer, Schema, Stream } from "effect"
-import { LLM, LLMClient, Provider, ProviderID, Tool, type ProviderModelOptions } from "@opencode-ai/llm"
-import { Route, Auth, Endpoint, Framing, Protocol, RequestExecutor } from "@opencode-ai/llm/route"
+import { LLM, LLMClient, ProviderID, Tool } from "@opencode-ai/llm"
+import { Route, Auth, Endpoint, Framing, Protocol, RequestExecutor, WebSocketExecutor } from "@opencode-ai/llm/route"
 import { OpenAI } from "@opencode-ai/llm/providers"

 /**
@@ -18,18 +18,18 @@ const apiKey = Config.redacted("OPENAI_API_KEY")

 // 1. Pick a model. The provider helper records provider identity, protocol
 // choice, capabilities, deployment options, authentication, and defaults.
-const model = OpenAI.model("gpt-4o-mini", {
+const model = OpenAI.configure({
  apiKey,
  generation: { maxTokens: 160 },
  providerOptions: {
    openai: { store: false },
  },
-})
+}).model("gpt-4o-mini")

 // 2. Build a provider-neutral request. This is useful when reusing one request
 // across generate and stream examples.
 //
-// Options can live on both the model and the request:
+// Options can live on both the configured route/provider facade and the request:
 //
 //   - `generation`: common controls such as max tokens, temperature, topP/topK,
 //     penalties, seed, and stop sequences.
@@ -39,7 +39,7 @@ const model = OpenAI.model("gpt-4o-mini", {
 //   - `http`: last-resort serializable overlays for final request body, headers,
 //     and query params. Prefer typed `providerOptions` when a field is stable.
 //
-// Model options are defaults. Request options override them for this call.
+// Route/provider options are defaults. Request options override them for this call.
 const request = LLM.request({
  model,
  system: "You are concise and practical.",
@@ -193,19 +193,22 @@ const FakeProtocol = Protocol.make<FakeBody, string, string, void>({
 // axes that the protocol deliberately does not know: URL, auth, and framing.
 const FakeAdapter = Route.make({
  id: "fake-echo",
+  provider: "fake-echo",
  protocol: FakeProtocol,
-  endpoint: Endpoint.path("/v1/echo"),
+  endpoint: Endpoint.path("/v1/echo", { baseURL: "https://fake.local" }),
  auth: Auth.passthrough,
  framing: Framing.sse,
 })

-// A provider module exports a Provider definition. The default `model` helper
-// sets provider identity, protocol id, and the route id resolved by the registry.
-const fakeEchoModel = Route.model(FakeAdapter, { provider: "fake-echo", baseURL: "https://fake.local" })
-const FakeEcho = Provider.make({
+// A provider module exports a configured facade. Configuration happens before
+// model selection; model selectors accept ids only.
+const FakeEcho = {
  id: ProviderID.make("fake-echo"),
-  model: (id: string, options: ProviderModelOptions = {}) => fakeEchoModel({ id, ...options }),
-})
+  configure: () => ({
+    id: ProviderID.make("fake-echo"),
+    model: (id: string) => FakeAdapter.model({ id }),
+  }),
+}

 // `LLMClient.prepare` is the lower-level inspection hook: it compiles through
 // body conversion, validation, endpoint, auth, and HTTP construction without
@@ -213,7 +216,7 @@ const FakeEcho = Provider.make({
 const inspectFakeProvider = Effect.gen(function* () {
  const prepared = yield* LLMClient.prepare(
    LLM.request({
-      model: FakeEcho.model("tiny-echo"),
+      model: FakeEcho.configure().model("tiny-echo"),
      prompt: "Show me the provider pipeline.",
    }),
  )
@@ -227,7 +230,8 @@ const inspectFakeProvider = Effect.gen(function* () {
 // enabled at a time so the tutorial can demonstrate generate, prepare, stream,
 // or tool-loop behavior without spending tokens on every example.
 const requestExecutorLayer = RequestExecutor.defaultLayer
-const llmClientLayer = LLMClient.layer.pipe(Layer.provide(requestExecutorLayer))
+const llmDeps = Layer.mergeAll(requestExecutorLayer, WebSocketExecutor.layer)
+const llmClientLayer = LLMClient.layer.pipe(Layer.provide(llmDeps))

 const program = Effect.gen(function* () {
  // yield* generateOnce
@@ -237,6 +241,6 @@ const program = Effect.gen(function* () {
  // yield* generateStructuredObject
  // yield* generateDynamicObject.pipe(Effect.andThen((response) => Effect.sync(() => console.log(response.object))))
  yield* streamWithTools
-}).pipe(Effect.provide(Layer.mergeAll(requestExecutorLayer, llmClientLayer)))
+}).pipe(Effect.provide(Layer.mergeAll(llmDeps, llmClientLayer)))

 Effect.runPromise(program)
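
Reviewer sketch (not part of the patch): the tutorial now calls `FakeEcho.configure().model(...)`, so configuration always precedes model selection. The following is a minimal, dependency-free illustration of that facade shape combined with a lazy environment default, assuming a hypothetical `ConfigResult` union in place of the package's typed configuration error. `AzureSketch`, `envBaseURL`, and the other names are illustrative and are not the real `Endpoint.envBaseURL` API.

```typescript
// All names below are hypothetical; they only mirror the shape of the facades in this diff.
type Env = Record<string, string | undefined>

type ConfigResult =
  | { readonly ok: true; readonly value: string }
  | { readonly ok: false; readonly error: { readonly _tag: "MissingConfig"; readonly variable: string } }

// A lazy value is resolved when a request is prepared, never at import time.
type LazyString = (env: Env) => ConfigResult

const envBaseURL = (variable: string, build: (value: string) => string): LazyString => (env) => {
  const value = env[variable]
  return value
    ? { ok: true, value: build(value) }
    : { ok: false, error: { _tag: "MissingConfig", variable } }
}

interface SketchModel {
  readonly id: string
  readonly baseURL: LazyString
}

interface AzureInput {
  readonly resourceName?: string
}

// The explicit return type lets `configure` refer back to the same function.
interface AzureFacadeSketch {
  readonly model: (id: string) => SketchModel
  readonly configure: (input?: AzureInput) => AzureFacadeSketch
}

const azureURL = (resourceName: string) => `https://${resourceName}.openai.azure.com/openai/v1`

const configureAzureSketch = (input: AzureInput = {}): AzureFacadeSketch => {
  const resourceName = input.resourceName
  // Explicit config wins; otherwise fall back to the lazy env lookup.
  const baseURL: LazyString = resourceName
    ? () => ({ ok: true, value: azureURL(resourceName) })
    : envBaseURL("AZURE_RESOURCE_NAME", azureURL)
  return {
    model: (id) => ({ id, baseURL }),
    configure: configureAzureSketch,
  }
}

// A default facade is reasonable here only because the missing piece has a lazy source.
const AzureSketch = configureAzureSketch()
```

Selecting a model from `AzureSketch` never fails at import time in this sketch. A missing `AZURE_RESOURCE_NAME` surfaces only when the lazy value is resolved, and it surfaces as a tagged value rather than an untyped exception.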
--- a/packages/llm/src/cache-policy.ts
+++ b/packages/llm/src/cache-policy.ts
@@ -97,7 +97,7 @@ const markMessages = (
 }

 export const applyCachePolicy = (request: LLMRequest): LLMRequest => {
-  if (!RESPECTS_INLINE_HINTS.has(request.model.route)) return request
+  if (!RESPECTS_INLINE_HINTS.has(request.model.route.id)) return request
  const policy = resolve(request.cache)
  if (!policy.tools && !policy.system && !policy.messages) return request

--- a/packages/llm/src/index.ts
+++ b/packages/llm/src/index.ts
@@ -1,4 +1,4 @@
-export { LLMClient, modelLimits, modelRef } from "./route/client"
+export { LLMClient } from "./route/client"
 export { Auth } from "./route/auth"
 export { Provider } from "./provider"
 export type {
@@ -6,7 +6,6 @@ export type {
  RouteRoutedModelInput,
  Interface as LLMClientShape,
  Service as LLMClientService,
-  ModelRefInput,
 } from "./route/client"
 export * from "./schema"
 export { Tool, ToolFailure, toDefinitions, tool } from "./tool"
--- a/packages/llm/src/llm.ts
+++ b/packages/llm/src/llm.ts
@@ -1,5 +1,5 @@
 import { Effect, JsonSchema, Schema } from "effect"
-import { LLMClient, modelLimits, modelRef, type ModelRefInput } from "./route/client"
+import { LLMClient } from "./route/client"
 import {
  GenerationOptions,
  HttpOptions,
@@ -9,6 +9,7 @@ import {
  LLMRequest,
  LLMResponse,
  Message,
+  type ModelInput as SchemaModelInput,
  SystemPart,
  ToolChoice,
  ToolDefinition,
@@ -18,7 +19,7 @@ import {
 } from "./schema"
 import { make as makeTool, type ToolSchema } from "./tool"

-export type ModelInput = ModelRefInput
+export type ModelInput = SchemaModelInput

 export type MessageInput = Message.Input

@@ -42,10 +43,6 @@ export type RequestInput = Omit<
  readonly http?: HttpOptions.Input
 }

-export const limits = modelLimits
-
-export const model = modelRef
-
 export const generate = LLMClient.generate

 export const stream = LLMClient.stream
--- a/packages/llm/src/protocols/anthropic-messages.ts
+++ b/packages/llm/src/protocols/anthropic-messages.ts
@@ -386,7 +386,7 @@ const fromRequest = Effect.fn("AnthropicMessages.fromRequest")(function* (reques
    tools,
    tool_choice: toolChoice,
    stream: true as const,
-    max_tokens: generation?.maxTokens ?? request.model.limits.output ?? 4096,
+    max_tokens: generation?.maxTokens ?? request.model.route.defaults.limits?.output ?? 4096,
    temperature: generation?.temperature,
    top_p: generation?.topP,
    top_k: generation?.topK,
@@ -452,8 +452,8 @@ const mergeUsage = (left: Usage | undefined, right: Usage | undefined) => {
    totalTokens: ProviderShared.totalTokens(inputTokens, outputTokens, undefined),
    providerMetadata: {
      anthropic: {
-        ...(left.providerMetadata?.["anthropic"] ?? {}),
-        ...(right.providerMetadata?.["anthropic"] ?? {}),
+        ...left.providerMetadata?.["anthropic"],
+        ...right.providerMetadata?.["anthropic"],
      },
    },
  })
@@ -673,19 +673,12 @@ export const protocol = Protocol.make({

 export const route = Route.make({
  id: ADAPTER,
+  provider: "anthropic",
  protocol,
-  endpoint: Endpoint.path(PATH),
-  auth: Auth.apiKeyHeader("x-api-key"),
+  endpoint: Endpoint.path(PATH, { baseURL: DEFAULT_BASE_URL }),
+  auth: Auth.none,
  framing: Framing.sse,
  headers: () => ({ "anthropic-version": "2023-06-01" }),
 })

-// =============================================================================
-// Model Helper
-// =============================================================================
-export const model = Route.model(route, {
-  provider: "anthropic",
-  baseURL: DEFAULT_BASE_URL,
-})
-
 export * as AnthropicMessages from "./anthropic-messages"
--- a/packages/llm/src/protocols/bedrock-converse.ts
+++ b/packages/llm/src/protocols/bedrock-converse.ts
@@ -1,5 +1,5 @@
 import { Effect, Schema } from "effect"
-import { Route, type RouteModelInput } from "../route/client"
+import { Route } from "../route/client"
 import { Endpoint } from "../route/endpoint"
 import { Protocol } from "../route/protocol"
 import {
@@ -14,7 +14,7 @@ import {
 } from "../schema"
 import { BedrockEventStream } from "./bedrock-event-stream"
 import { JsonObject, optionalArray, ProviderShared } from "./shared"
-import { BedrockAuth, type Credentials as BedrockCredentials } from "./utils/bedrock-auth"
+import { BedrockAuth } from "./utils/bedrock-auth"
 import { BedrockCache } from "./utils/bedrock-cache"
 import { BedrockMedia } from "./utils/bedrock-media"
 import { Lifecycle } from "./utils/lifecycle"
@@ -24,23 +24,6 @@ const ADAPTER = "bedrock-converse"

 export type { Credentials as BedrockCredentials } from "./utils/bedrock-auth"

-// =============================================================================
-// Public Model Input
-// =============================================================================
-export type BedrockConverseModelInput = RouteModelInput & {
-  /**
-   * Bearer API key (Bedrock's newer API key auth). Sets the `Authorization`
-   * header and bypasses SigV4 signing. Mutually exclusive with `credentials`.
-   */
-  readonly apiKey?: string
-  /**
-   * AWS credentials for SigV4 signing. The route signs each request at
-   * `toHttp` time using `aws4fetch`. Mutually exclusive with `apiKey`.
-   */
-  readonly credentials?: BedrockCredentials
-  readonly headers?: Record<string, string>
-}
-
 // =============================================================================
 // Request Body Schema
 // =============================================================================
@@ -61,6 +44,7 @@ type BedrockToolUseBlock = Schema.Schema.Type<typeof BedrockToolUseBlock>
 const BedrockToolResultContentItem = Schema.Union([
  Schema.Struct({ text: Schema.String }),
  Schema.Struct({ json: Schema.Unknown }),
+  BedrockMedia.ImageBlock,
 ])

 const BedrockToolResultBlock = Schema.Struct({
@@ -261,15 +245,33 @@ const lowerToolCall = (part: ToolCallPart): BedrockToolUseBlock => ({
  },
 })

-const lowerToolResult = (part: ToolResultPart): BedrockToolResultBlock => ({
-  toolResult: {
-    toolUseId: part.id,
-    content:
-      part.result.type === "text" || part.result.type === "error"
-        ? [{ text: ProviderShared.toolResultText(part) }]
-        : [{ json: part.result.value }],
-    status: part.result.type === "error" ? "error" : "success",
-  },
+const lowerToolResultContent = Effect.fn("BedrockConverse.lowerToolResultContent")(function* (part: ToolResultPart) {
+  if (part.result.type === "text" || part.result.type === "error")
+    return [{ text: ProviderShared.toolResultText(part) }]
+  if (part.result.type === "json") return [{ json: part.result.value }]
+
+  const content: Array<Schema.Schema.Type<typeof BedrockToolResultContentItem>> = []
+  for (const item of part.result.value) {
+    if (item.type === "text") {
+      content.push({ text: item.text })
+      continue
+    }
+    const media = yield* BedrockMedia.lower(item)
+    if (!("image" in media))
+      return yield* ProviderShared.invalidRequest("Bedrock Converse only supports image media in tool results")
+    content.push(media)
+  }
+  return content
+})
+
+const lowerToolResult = Effect.fn("BedrockConverse.lowerToolResult")(function* (part: ToolResultPart) {
+  return {
+    toolResult: {
+      toolUseId: part.id,
+      content: yield* lowerToolResultContent(part),
+      status: part.result.type === "error" ? "error" : "success",
+    },
+  } satisfies BedrockToolResultBlock
 })

 const lowerMessages = Effect.fn("BedrockConverse.lowerMessages")(function* (
@@ -331,7 +333,7 @@ const lowerMessages = Effect.fn("BedrockConverse.lowerMessages")(function* (
    for (const part of message.content) {
      if (!ProviderShared.supportsContent(part, ["tool-result"]))
        return yield* ProviderShared.unsupportedContent("Bedrock Converse", "tool", ["tool-result"])
-      content.push(lowerToolResult(part))
+      content.push(yield* lowerToolResult(part))
      const cachePoint = BedrockCache.block(breakpoints, part.cache)
      if (cachePoint) content.push(cachePoint)
    }
@@ -597,11 +599,11 @@ export const protocol = Protocol.make({

 export const route = Route.make({
  id: ADAPTER,
+  provider: "bedrock",
  protocol,
-  // Bedrock's URL embeds the region in the host (set on `model.baseURL` by
-  // the provider helper from credentials) and the validated modelId in the
-  // path. We read the validated body so the URL matches the body that gets
-  // signed.
+  // Bedrock's URL embeds the region in the route endpoint host and the
+  // validated modelId in the path. We read the validated body so the URL
+  // matches the body that gets signed.
  endpoint: Endpoint.path<BedrockConverseBody>(
    ({ body }) => `/model/${encodeURIComponent(body.modelId)}/converse-stream`,
  ),
@@ -609,26 +611,6 @@ export const route = Route.make({
  framing,
 })

-export const nativeCredentials = BedrockAuth.nativeCredentials
-
-const bedrockModel = Route.model(
-  route,
-  {
-    provider: "bedrock",
-  },
-  {
-    mapInput: (input: BedrockConverseModelInput) => {
-      const { credentials, ...rest } = input
-      const region = credentials?.region ?? "us-east-1"
-      return {
-        ...rest,
-        baseURL: rest.baseURL ?? `https://bedrock-runtime.${region}.amazonaws.com`,
-        native: nativeCredentials(input.native, credentials),
-      }
-    },
-  },
-)
-
-export const model = bedrockModel
+export const sigV4Auth = BedrockAuth.sigV4

 export * as BedrockConverse from "./bedrock-converse"
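
Reviewer sketch (not part of the patch): the new `lowerToolResultContent` branches on the tool result type and rejects non-image media. The same control flow can be followed without Effect in the simplified version below. The type names, the `"content"` result tag, and the exact image block layout are assumptions made for illustration; the real code delegates media handling to `BedrockMedia.lower` and fails through `ProviderShared.invalidRequest`.

```typescript
// Hypothetical, simplified stand-ins for the schema types used in the diff.
type ContentItem =
  | { type: "text"; text: string }
  | { type: "media"; mediaType: string; data: string }

type ToolResult =
  | { type: "text"; value: string }
  | { type: "error"; value: string }
  | { type: "json"; value: unknown }
  | { type: "content"; value: ReadonlyArray<ContentItem> }

type BedrockBlock =
  | { text: string }
  | { json: unknown }
  | { image: { format: string; source: { bytes: string } } }

type Lowered =
  | { ok: true; content: BedrockBlock[] }
  | { ok: false; message: string }

const lowerToolResultContentSketch = (result: ToolResult): Lowered => {
  // Text and error results both become a single text block.
  if (result.type === "text" || result.type === "error") return { ok: true, content: [{ text: result.value }] }
  // JSON results pass through as a single json block.
  if (result.type === "json") return { ok: true, content: [{ json: result.value }] }

  // Mixed content: keep text, lower images, reject every other media type.
  const content: BedrockBlock[] = []
  for (const item of result.value) {
    if (item.type === "text") {
      content.push({ text: item.text })
      continue
    }
    if (!item.mediaType.startsWith("image/"))
      return { ok: false, message: "Bedrock Converse only supports image media in tool results" }
    content.push({
      image: { format: item.mediaType.slice("image/".length), source: { bytes: item.data } },
    })
  }
  return { ok: true, content }
}
```

The early return on the first unsupported item mirrors the diff: one bad media part fails the whole tool result rather than silently dropping content.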
--- a/packages/llm/src/protocols/gemini.ts
+++ b/packages/llm/src/protocols/gemini.ts
@@ -404,19 +404,14 @@ export const protocol = Protocol.make({

 export const route = Route.make({
  id: ADAPTER,
+  provider: "google",
  protocol,
  // Gemini's path embeds the model id and pins SSE framing at the URL level.
-  endpoint: Endpoint.path(({ request }) => `/models/${request.model.id}:streamGenerateContent?alt=sse`),
-  auth: Auth.apiKeyHeader("x-goog-api-key"),
+  endpoint: Endpoint.path(({ request }) => `/models/${request.model.id}:streamGenerateContent?alt=sse`, {
+    baseURL: DEFAULT_BASE_URL,
+  }),
+  auth: Auth.none,
  framing: Framing.sse,
 })

-// =============================================================================
-// Model Helper
-// =============================================================================
-export const model = Route.model(route, {
-  provider: "google",
-  baseURL: DEFAULT_BASE_URL,
-})
-
 export * as Gemini from "./gemini"
--- a/packages/llm/src/protocols/openai-chat.ts
+++ b/packages/llm/src/protocols/openai-chat.ts
@@ -2,7 +2,6 @@ import { Array as Arr, Effect, Schema } from "effect"
 import { Route } from "../route/client"
 import { Auth } from "../route/auth"
 import { Endpoint } from "../route/endpoint"
-import { Framing } from "../route/framing"
 import { HttpTransport } from "../route/transport"
 import { Protocol } from "../route/protocol"
 import {
@@ -393,28 +392,15 @@ export const protocol = Protocol.make({
  },
 })

-const encodeBody = Schema.encodeSync(Schema.fromJsonString(OpenAIChatBody))
-
-export const httpTransport = HttpTransport.httpJson({
-  endpoint: Endpoint.path(PATH),
-  auth: Auth.bearer(),
-  framing: Framing.sse,
-  encodeBody,
-})
+export const httpTransport = HttpTransport.sseJson.with<OpenAIChatBody>()

 export const route = Route.make({
  id: ADAPTER,
  provider: "openai",
  protocol,
+  endpoint: Endpoint.path(PATH, { baseURL: DEFAULT_BASE_URL }),
+  auth: Auth.none,
  transport: httpTransport,
-  defaults: {
-    baseURL: DEFAULT_BASE_URL,
-  },
 })

-// =============================================================================
-// Model Helper
-// =============================================================================
-export const model = route.model
-
 export * as OpenAIChat from "./openai-chat"
--- a/packages/llm/src/protocols/openai-compatible-chat.ts
+++ b/packages/llm/src/protocols/openai-compatible-chat.ts
@@ -5,16 +5,14 @@ import * as OpenAIChat from "./openai-chat"

 const ADAPTER = "openai-compatible-chat"

-export type OpenAICompatibleChatModelInput = Omit<RouteRoutedModelInput, "baseURL"> & {
-  readonly baseURL: string
-}
+export type OpenAICompatibleChatModelInput = RouteRoutedModelInput

 /**
 * Route for non-OpenAI providers that expose an OpenAI Chat-compatible
 * `/chat/completions` endpoint. Reuses `OpenAIChat.protocol` end-to-end and
 * overrides only the route id so providers can be resolved per-family without
- * colliding with native OpenAI. The model carries the host on `baseURL`,
- * supplied by whichever profile/provider helper builds it.
+ * colliding with native OpenAI. Provider helpers configure the route endpoint
+ * before model selection.
 */
 export const route = Route.make({
  id: ADAPTER,
@@ -23,6 +21,4 @@ export const route = Route.make({
  framing: Framing.sse,
 })

-export const model = Route.model<OpenAICompatibleChatModelInput>(route)
-
 export * as OpenAICompatibleChat from "./openai-compatible-chat"
--- a/packages/llm/src/protocols/openai-responses.ts
+++ b/packages/llm/src/protocols/openai-responses.ts
@@ -2,11 +2,11 @@ import { Effect, Schema } from "effect"
 import { Route } from "../route/client"
 import { Auth } from "../route/auth"
 import { Endpoint } from "../route/endpoint"
-import { Framing } from "../route/framing"
 import { HttpTransport, WebSocketTransport } from "../route/transport"
 import { Protocol } from "../route/protocol"
 import {
  LLMEvent,
+  type MediaPart,
  Usage,
  type FinishReason,
  type LLMRequest,
@@ -31,6 +31,12 @@ const OpenAIResponsesInputText = Schema.Struct({
  type: Schema.tag("input_text"),
  text: Schema.String,
 })
+const OpenAIResponsesInputImage = Schema.Struct({
+  type: Schema.tag("input_image"),
+  image_url: Schema.String,
+})
+const OpenAIResponsesInputContent = Schema.Union([OpenAIResponsesInputText, OpenAIResponsesInputImage])
+type OpenAIResponsesInputContent = Schema.Schema.Type<typeof OpenAIResponsesInputContent>

 const OpenAIResponsesOutputText = Schema.Struct({
  type: Schema.tag("output_text"),
@@ -39,7 +45,7 @@ const OpenAIResponsesOutputText = Schema.Struct({

 const OpenAIResponsesInputItem = Schema.Union([
  Schema.Struct({ role: Schema.tag("system"), content: Schema.String }),
-  Schema.Struct({ role: Schema.tag("user"), content: Schema.Array(OpenAIResponsesInputText) }),
+  Schema.Struct({ role: Schema.tag("user"), content: Schema.Array(OpenAIResponsesInputContent) }),
  Schema.Struct({ role: Schema.tag("assistant"), content: Schema.Array(OpenAIResponsesOutputText) }),
  Schema.Struct({
    type: Schema.tag("function_call"),
@@ -151,12 +157,15 @@ const OpenAIResponsesEvent = Schema.Struct({
  item_id: Schema.optional(Schema.String),
  item: Schema.optional(OpenAIResponsesStreamItem),
  response: Schema.optional(
-    Schema.Struct({
-      id: Schema.optional(Schema.String),
-      service_tier: Schema.optional(Schema.String),
-      incomplete_details: optionalNull(Schema.Struct({ reason: Schema.String })),
-      usage: optionalNull(OpenAIResponsesUsage),
-    }),
+    Schema.StructWithRest(
+      Schema.Struct({
+        id: Schema.optional(Schema.String),
+        service_tier: optionalNull(Schema.String),
+        incomplete_details: optionalNull(Schema.Struct({ reason: Schema.String })),
+        usage: optionalNull(OpenAIResponsesUsage),
+      }),
+      [Schema.Record(Schema.String, Schema.Unknown)],
+    ),
  ),
  code: Schema.optional(Schema.String),
  message: Schema.optional(Schema.String),
@@ -196,6 +205,22 @@ const lowerToolCall = (part: ToolCallPart): OpenAIResponsesInputItem => ({
  arguments: ProviderShared.encodeJson(part.input),
 })

+const imageUrl = (part: MediaPart) =>
+  typeof part.data === "string" && part.data.startsWith("data:")
+    ? part.data
+    : `data:${part.mediaType};base64,${ProviderShared.mediaBytes(part)}`
+
+const lowerUserContent = Effect.fn("OpenAIResponses.lowerUserContent")(function* (
+  part: LLMRequest["messages"][number]["content"][number],
+) {
+  if (part.type === "text") return { type: "input_text" as const, text: part.text }
+  if (part.type === "media" && part.mediaType.startsWith("image/")) {
+    return { type: "input_image" as const, image_url: imageUrl(part) }
+  }
+  if (part.type === "media") return yield* invalid("OpenAI Responses user media content only supports images")
+  return yield* ProviderShared.unsupportedContent("OpenAI Responses", "user", ["text", "media"])
+})
+
 const lowerMessages = Effect.fn("OpenAIResponses.lowerMessages")(function* (request: LLMRequest) {
  const system: OpenAIResponsesInputItem[] =
    request.system.length === 0 ? [] : [{ role: "system", content: ProviderShared.joinText(request.system) }]
@@ -203,13 +228,7 @@ const lowerMessages = Effect.fn("OpenAIResponses.lowerMessages")(function* (requ

  for (const message of request.messages) {
    if (message.role === "user") {
-      const content: TextPart[] = []
-      for (const part of message.content) {
-        if (!ProviderShared.supportsContent(part, ["text"]))
-          return yield* ProviderShared.unsupportedContent("OpenAI Responses", "user", ["text"])
-        content.push(part)
-      }
-      input.push({ role: "user", content: content.map((part) => ({ type: "input_text", text: part.text })) })
+      input.push({ role: "user", content: yield* Effect.forEach(message.content, lowerUserContent) })
      continue
    }

@@ -536,27 +555,18 @@ export const protocol = Protocol.make({
  },
 })

-const encodeBody = Schema.encodeSync(Schema.fromJsonString(OpenAIResponsesBody))
-const transportBase = {
-  endpoint: Endpoint.path<OpenAIResponsesBody>(PATH),
-  auth: Auth.bearer(),
-  encodeBody,
-}
-const routeDefaults = {
-  baseURL: DEFAULT_BASE_URL,
-}
+const endpoint = Endpoint.path<OpenAIResponsesBody>(PATH, { baseURL: DEFAULT_BASE_URL })
+const auth = Auth.none

-export const httpTransport = HttpTransport.httpJson({
-  ...transportBase,
-  framing: Framing.sse,
-})
+export const httpTransport = HttpTransport.sseJson.with<OpenAIResponsesBody>()

 export const route = Route.make({
  id: ADAPTER,
  provider: "openai",
  protocol,
+  endpoint,
+  auth,
  transport: httpTransport,
-  defaults: routeDefaults,
 })

 const decodeWebSocketMessage = ProviderShared.validateWith(Schema.decodeUnknownEffect(OpenAIResponsesWebSocketMessage))
@@ -569,8 +579,10 @@ const webSocketMessage = (body: OpenAIResponsesBody | Record<string, unknown>) =
    return yield* decodeWebSocketMessage({ ...message, type: "response.create" })
  })

-export const webSocketTransport = WebSocketTransport.json({
-  ...transportBase,
+export const webSocketTransport = WebSocketTransport.jsonTransport.with<
+  OpenAIResponsesBody,
+  OpenAIResponsesWebSocketMessage
+>({
  toMessage: webSocketMessage,
  encodeMessage: encodeWebSocketMessage,
 })
@@ -579,15 +591,9 @@ export const webSocketRoute = Route.make({
  id: `${ADAPTER}-websocket`,
  provider: "openai",
  protocol,
+  endpoint,
+  auth,
  transport: webSocketTransport,
-  defaults: routeDefaults,
 })

-// =============================================================================
-// Model Helper
-// =============================================================================
-export const model = route.model
-
-export const webSocketModel = webSocketRoute.model
-
 export * as OpenAIResponses from "./openai-responses"
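
Reviewer note on the `openai-responses.ts` hunks above: user content may now carry images as well as text. The sketch below is a minimal, non-Effect approximation of that lowering. The part types are simplified stand-ins (hypothetical, not the package's schema classes), and `data` is assumed to already be either a data URL or a base64 string, which is what `ProviderShared.mediaBytes` appears to supply in the real code.

```typescript
// Hypothetical, simplified part types; the real ones live in the package schema.
type TextPart = { type: "text"; text: string }
type MediaPart = { type: "media"; mediaType: string; data: string }
type UserPart = TextPart | MediaPart

type InputContent =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string }

// Pass an existing data URL through; otherwise wrap the base64 payload in one.
const imageUrl = (part: MediaPart): string =>
  part.data.startsWith("data:") ? part.data : `data:${part.mediaType};base64,${part.data}`

// Text and image parts are lowered; any other media type is rejected.
const lowerUserContent = (part: UserPart): InputContent => {
  if (part.type === "text") return { type: "input_text", text: part.text }
  if (part.mediaType.startsWith("image/")) return { type: "input_image", image_url: imageUrl(part) }
  throw new Error("OpenAI Responses user media content only supports images")
}
```

The real helper fails through the Effect error channel (`invalid(...)`) rather than throwing; the `throw` here only keeps the sketch dependency-free.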
--- a/packages/llm/src/protocols/shared.ts
+++ b/packages/llm/src/protocols/shared.ts
@@ -11,6 +11,7 @@ import {
  type MediaPart,
  type ToolResultPart,
 } from "../schema"
+export { isRecord } from "../utils/record"

 export const Json = Schema.fromJsonString(Schema.Unknown)
 export const decodeJson = Schema.decodeUnknownSync(Json)
@@ -19,13 +20,6 @@ export const JsonObject = Schema.Record(Schema.String, Schema.Unknown)
 export const optionalArray = <const S extends Schema.Top>(schema: S) => Schema.optional(Schema.Array(schema))
 export const optionalNull = <const S extends Schema.Top>(schema: S) => Schema.optional(Schema.NullOr(schema))

-/**
- * Plain-record narrowing. Excludes arrays so routes checking nested JSON
- * Schema fragments don't accidentally treat a tuple as a key/value bag.
- */
-export const isRecord = (value: unknown): value is Record<string, unknown> =>
-  typeof value === "object" && value !== null && !Array.isArray(value)
-
 /**
 * Streaming tool-call accumulator. Adapters that build a tool call across
 * multiple `tool-input-delta` chunks store the partial JSON input string here
@@ -132,6 +126,7 @@ export const trimBaseUrl = (value: string) => value.replace(/\/+$/, "")

 export const toolResultText = (part: ToolResultPart) => {
  if (part.result.type === "text" || part.result.type === "error") return String(part.result.value)
+  if (part.result.type === "content") return encodeJson(part.result.value)
  return encodeJson(part.result.value)
 }
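
Reviewer note on the `shared.ts` hunks above: `isRecord` moves to `../utils/record` and is re-exported, so callers should see no behavior change. As a quick reminder of what the guard accepts, here is the removed definition reproduced as a standalone sketch with a few illustrative inputs (the inputs are made up for this note).

```typescript
// Plain-record narrowing, copied from the definition removed in this diff.
const isRecord = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null && !Array.isArray(value)

// Illustrative inputs: only the plain object passes the guard.
const candidates: unknown[] = [{ type: "object" }, ["a", "b"], null, "text"]
const accepted = candidates.filter(isRecord)
```

Arrays and `null` are both `typeof "object"`, which is why the guard checks for them explicitly.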

--- a/packages/llm/src/protocols/utils/bedrock-auth.ts
+++ b/packages/llm/src/protocols/utils/bedrock-auth.ts
@@ -1,15 +1,14 @@
 import { AwsV4Signer } from "aws4fetch"
-import { Effect, Option, Schema } from "effect"
+import { Effect } from "effect"
 import { Headers } from "effect/unstable/http"
 import { Auth, type AuthInput } from "../../route/auth"
-import type { LLMRequest } from "../../schema"
 import { ProviderShared } from "../shared"

 /**
- * AWS credentials for SigV4 signing. Bedrock also supports Bearer API key auth
- * via `model.apiKey`, which bypasses SigV4 signing. STS-vended credentials
- * should be refreshed by the consumer (rebuild the model) before they expire;
- * the route does not refresh.
+ * AWS credentials for SigV4 signing. Bedrock also supports Bearer API key auth,
+ * which provider facades configure as route auth instead of SigV4. STS-vended
+ * credentials should be refreshed by the consumer (rebuild the model) before
+ * they expire; the route does not refresh.
 */
 export interface Credentials {
  readonly region: string
@@ -18,32 +17,6 @@ export interface Credentials {
  readonly sessionToken?: string
 }

-const NativeCredentials = Schema.Struct({
-  accessKeyId: Schema.String,
-  secretAccessKey: Schema.String,
-  region: Schema.optional(Schema.String),
-  sessionToken: Schema.optional(Schema.String),
-})
-
-const decodeNativeCredentials = Schema.decodeUnknownOption(NativeCredentials)
-
-export const region = (request: LLMRequest) => {
-  const fromNative = request.model.native?.aws_region
-  if (typeof fromNative === "string" && fromNative !== "") return fromNative
-  return (
-    decodeNativeCredentials(request.model.native?.aws_credentials).pipe(
-      Option.map((credentials) => credentials.region),
-      Option.getOrUndefined,
-    ) ?? "us-east-1"
-  )
-}
-
-const credentialsFromInput = (request: LLMRequest): Credentials | undefined =>
-  decodeNativeCredentials(request.model.native?.aws_credentials).pipe(
-    Option.map((creds) => ({ ...creds, region: creds.region ?? region(request) })),
-    Option.getOrUndefined,
-  )
-
 const signRequest = (input: {
  readonly url: string
  readonly body: string
@@ -71,33 +44,27 @@ const signRequest = (input: {
      ),
  })

-/**
- * Bedrock auth. `model.apiKey` (Bedrock's newer Bearer API key auth) wins if
- * set; otherwise sign the exact JSON bytes with SigV4 using credentials from
- * `model.native.aws_credentials`.
- */
-export const auth = Auth.custom((input: AuthInput) => {
-  if (input.request.model.apiKey) return Auth.toEffect(Auth.bearer())(input)
-  return Effect.gen(function* () {
-    const credentials = credentialsFromInput(input.request)
-    if (!credentials) {
-      return yield* ProviderShared.invalidRequest(
-        "Bedrock Converse requires either model.apiKey or AWS credentials in model.native.aws_credentials",
-      )
-    }
-    const headersForSigning = Headers.set(input.headers, "content-type", "application/json")
-    const signed = yield* signRequest({ url: input.url, body: input.body, headers: headersForSigning, credentials })
-    return Headers.setAll(headersForSigning, signed)
-  })
-})
-
-export const nativeCredentials = (native: Record<string, unknown> | undefined, credentials: Credentials | undefined) =>
-  credentials
-    ? {
-        ...native,
-        aws_credentials: credentials,
-        aws_region: credentials.region,
+/** Sign the exact JSON bytes with SigV4 using credentials configured on the route. */
+export const sigV4 = (credentials: Credentials | undefined) =>
+  Auth.custom((input: AuthInput) => {
+    return Effect.gen(function* () {
+      if (!credentials) {
+        return yield* ProviderShared.invalidRequest(
+          "Bedrock Converse requires either route bearer auth or AWS credentials configured on the route",
+        )
      }
-    : native
+      const headersForSigning = Headers.set(input.headers, "content-type", "application/json")
+      const signed = yield* signRequest({
+        url: input.url,
+        body: input.body,
+        headers: headersForSigning,
+        credentials,
+      })
+      return Headers.setAll(headersForSigning, signed)
+    })
+  })
+
+/** Bedrock route auth defaults to SigV4 and expects credentials from route configuration. */
+export const auth = sigV4(undefined)

 export * as BedrockAuth from "./bedrock-auth"
--- a/packages/llm/src/provider.ts
+++ b/packages/llm/src/provider.ts
@@ -1,14 +1,20 @@
-import type { RouteModelInput } from "./route/client"
-import type { ModelID, ModelRef, ProviderID } from "./schema"
+import type { RouteDefaultsInput } from "./route/client"
+import type { Model, ModelID, ProviderID } from "./schema"

-export type ModelOptions = Omit<RouteModelInput, "id">
+export type ModelOptions = RouteDefaultsInput

+/**
+ * Advanced structural provider definition helper. Built-in providers should
+ * prefer explicit `configure(options).model(id)` facades so deployment config is
+ * chosen before model selection. The optional `apis` map remains for external
+ * structural providers that expose multiple route selectors behind one provider.
+ */
 export type ModelFactory<Options extends ModelOptions = ModelOptions> = (
  id: string | ModelID,
  options?: Options,
-) => ModelRef
+) => Model

-type AnyModelFactory = (...args: never[]) => ModelRef
+type AnyModelFactory = (...args: never[]) => Model

 export interface Definition<Factory extends AnyModelFactory = ModelFactory> {
  readonly id: ProviderID
@@ -18,8 +24,8 @@ export interface Definition<Factory extends AnyModelFactory = ModelFactory> {

 type DefinitionShape = {
  readonly id: ProviderID
-  readonly model: (...args: never[]) => ModelRef
-  readonly apis?: Record<string, (...args: never[]) => ModelRef>
+  readonly model: (...args: never[]) => Model
+  readonly apis?: Record<string, (...args: never[]) => Model>
 }

 type NoExtraFields<Input, Shape> = Input & Record<Exclude<keyof Input, keyof Shape>, never>
--- a/packages/llm/src/providers/amazon-bedrock.ts
+++ b/packages/llm/src/providers/amazon-bedrock.ts
@@ -1,12 +1,12 @@
-import { Route, type RouteModelInput } from "../route/client"
-import { Provider } from "../provider"
+import type { RouteDefaultsInput } from "../route/client"
+import { Auth } from "../route/auth"
 import { ProviderID, type ModelID } from "../schema"
 import * as BedrockConverse from "../protocols/bedrock-converse"
 import type { BedrockCredentials } from "../protocols/bedrock-converse"

 export const id = ProviderID.make("amazon-bedrock")

-export type ModelOptions = Omit<RouteModelInput, "id" | "baseURL"> & {
+export type Config = RouteDefaultsInput & {
  readonly apiKey?: string
  readonly headers?: Record<string, string>
  readonly credentials?: BedrockCredentials
@@ -15,34 +15,29 @@ export type ModelOptions = Omit<RouteModelInput, "id" | "baseURL"> & {
  /** Override the computed `https://bedrock-runtime.<region>.amazonaws.com` URL. */
  readonly baseURL?: string
 }
-type ModelInput = ModelOptions & Pick<RouteModelInput, "id">
-
 export const routes = [BedrockConverse.route]

 const bedrockBaseURL = (region: string) => `https://bedrock-runtime.${region}.amazonaws.com`

-const converseModel = Route.model<ModelInput>(
-  BedrockConverse.route,
-  {
-    provider: "amazon-bedrock",
-  },
-  {
-    mapInput: (input) => {
-      const { credentials, region, baseURL, ...rest } = input
-      const resolvedRegion = region ?? credentials?.region ?? "us-east-1"
-      return {
-        ...rest,
-        baseURL: baseURL ?? bedrockBaseURL(resolvedRegion),
-        native: BedrockConverse.nativeCredentials(input.native, credentials),
-      }
-    },
-  },
-)
+const configuredRoute = (input: Config) => {
+  const { apiKey, credentials, region, baseURL, ...rest } = input
+  const resolvedRegion = region ?? credentials?.region ?? "us-east-1"
+  return BedrockConverse.route.with({
+    ...rest,
+    provider: id,
+    endpoint: { baseURL: baseURL ?? bedrockBaseURL(resolvedRegion) },
+    auth: apiKey === undefined ? BedrockConverse.sigV4Auth(credentials) : Auth.bearer(apiKey),
+  })
+}

-export const model = (modelID: string | ModelID, options: ModelOptions = {}) =>
-  converseModel({ ...options, id: modelID })
+export const configure = (input: Config = {}) => {
+  const route = configuredRoute(input)
+  return {
+    id,
+    model: (modelID: string | ModelID) => route.model({ id: modelID }),
+    configure,
+  }
+}

-export const provider = Provider.make({
-  id,
-  model,
-})
+export const provider = configure()
+export const model = provider.model
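
Reviewer note on the `amazon-bedrock.ts` hunks above: region, base URL, and auth are now resolved once in `configuredRoute`, before any model is selected. The sketch below isolates that precedence as plain functions. `AwsCredentials`, `BedrockConfig`, and `authKind` are hypothetical names used only for illustration; the real code returns a configured route, not this object.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
type AwsCredentials = { region: string; accessKeyId: string; secretAccessKey: string }
type BedrockConfig = { region?: string; baseURL?: string; credentials?: AwsCredentials; apiKey?: string }

const bedrockBaseURL = (region: string) => `https://bedrock-runtime.${region}.amazonaws.com`

const resolveEndpoint = (input: BedrockConfig) => {
  // Explicit region wins, then the credentials' region, then the default.
  const region = input.region ?? input.credentials?.region ?? "us-east-1"
  return {
    // An explicit baseURL overrides the computed regional URL.
    baseURL: input.baseURL ?? bedrockBaseURL(region),
    // Bearer auth is chosen only when an apiKey is supplied; otherwise SigV4.
    authKind: input.apiKey === undefined ? ("sigv4" as const) : ("bearer" as const),
  }
}
```

One consequence visible in the diff: `provider = configure()` has no credentials and no API key, so under this precedence the default provider falls on the SigV4 path and would hit the "requires either route bearer auth or AWS credentials" error at request time.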
--- a/packages/llm/src/providers/anthropic.ts
+++ b/packages/llm/src/providers/anthropic.ts
@@ -1,5 +1,6 @@
-import type { RouteModelInput } from "../route/client"
-import { Provider } from "../provider"
+import type { RouteDefaultsInput } from "../route/client"
+import { Auth } from "../route/auth"
+import type { ProviderAuthOption } from "../route/auth-options"
 import { ProviderID, type ModelID } from "../schema"
 import * as AnthropicMessages from "../protocols/anthropic-messages"

@@ -7,12 +8,28 @@ export const id = ProviderID.make("anthropic")

 export const routes = [AnthropicMessages.route]

-export const model = (
-  id: string | ModelID,
-  options: Omit<RouteModelInput, "id" | "baseURL"> & { readonly baseURL?: string } = {},
-) => AnthropicMessages.model({ ...options, id })
+export type Config = RouteDefaultsInput & ProviderAuthOption<"optional"> & { readonly baseURL?: string }

-export const provider = Provider.make({
-  id,
-  model,
-})
+const auth = (options: ProviderAuthOption<"optional">) => {
+  if ("auth" in options && options.auth) return options.auth
+  return Auth.optional("apiKey" in options ? options.apiKey : undefined, "apiKey")
+    .orElse(Auth.config("ANTHROPIC_API_KEY"))
+    .pipe(Auth.header("x-api-key"))
+}
+
+const configuredRoute = (input: Config) => {
+  const { apiKey: _, auth: _auth, baseURL, ...rest } = input
+  return AnthropicMessages.route.with({ ...rest, endpoint: { baseURL }, auth: auth(input) })
+}
+
+export const configure = (input: Config = {}) => {
+  const route = configuredRoute(input)
+  return {
+    id,
+    model: (modelID: string | ModelID) => route.model({ id: modelID }),
+    configure,
+  }
+}
+
+export const provider = configure()
+export const model = provider.model
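
Reviewer note on the `anthropic.ts` hunks above: the `auth` helper encodes a three-step precedence (explicit `auth`, then `apiKey`, then the `ANTHROPIC_API_KEY` config value) and attaches the result as `x-api-key`. The sketch below models that order with plain functions. `HeaderAuth`, `AnthropicOptions`, and the `env` parameter are hypothetical; the real code uses the `Auth` combinators, and what they do when no key is found is not shown in this diff.

```typescript
// Hypothetical stand-in for a route auth step: a function over request headers.
type HeaderAuth = (headers: Record<string, string>) => Record<string, string>
type AnthropicOptions = { auth?: HeaderAuth; apiKey?: string }

const resolveAuth = (options: AnthropicOptions, env: Record<string, string | undefined>): HeaderAuth => {
  // 1. A caller-supplied auth step replaces the default entirely.
  if (options.auth) return options.auth
  // 2. Otherwise use the explicit apiKey, 3. falling back to the environment.
  const key = options.apiKey ?? env["ANTHROPIC_API_KEY"]
  // Assumption: with no key available, this sketch leaves the headers untouched.
  return (headers) => (key === undefined ? headers : { ...headers, "x-api-key": key })
}
```

The Google facade later in this diff follows the same shape with `GOOGLE_GENERATIVE_AI_API_KEY` and the `x-goog-api-key` header.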
--- a/packages/llm/src/providers/azure.ts
+++ b/packages/llm/src/providers/azure.ts
@@ -1,83 +1,110 @@
 import { Auth } from "../route/auth"
 import { type AtLeastOne, type ProviderAuthOption } from "../route/auth-options"
-import { Route } from "../route/client"
-import type { ModelInput } from "../llm"
-import { Provider } from "../provider"
+import type { Route as RouteDef, RouteDefaultsInput } from "../route/client"
 import { ProviderID, type ModelID } from "../schema"
 import * as OpenAIChat from "../protocols/openai-chat"
 import * as OpenAIResponses from "../protocols/openai-responses"
 import { withOpenAIOptions, type OpenAIProviderOptionsInput } from "./openai-options"

 export const id = ProviderID.make("azure")
-const routeAuth = Auth.remove("authorization").andThen(Auth.apiKeyHeader("api-key"))
+const routeAuth = Auth.remove("authorization")

 // Azure needs the customer's resource URL; supply either `resourceName`
 // (helper builds the URL) or `baseURL` directly.
 type AzureURL = AtLeastOne<{ readonly resourceName: string; readonly baseURL: string }>

 export type ModelOptions = AzureURL &
-  Omit<ModelInput, "id" | "provider" | "route" | "apiKey" | "auth" | "baseURL"> &
+  RouteDefaultsInput &
  ProviderAuthOption<"optional"> & {
    readonly apiVersion?: string
+    readonly queryParams?: Record<string, string>
    readonly useCompletionUrls?: boolean
    readonly providerOptions?: OpenAIProviderOptionsInput
  }
-type AzureModelInput = ModelOptions & Pick<ModelInput, "id">
+export type Config = ModelOptions

 const resourceBaseURL = (resourceName: string) => `https://${resourceName.trim()}.openai.azure.com/openai/v1`

 const responsesRoute = OpenAIResponses.route.with({
  id: "azure-openai-responses",
  provider: id,
-  transport: OpenAIResponses.httpTransport.with({ auth: routeAuth }),
+  auth: routeAuth,
+  endpoint: {
+    query: { "api-version": "v1" },
+  },
 })

 const chatRoute = OpenAIChat.route.with({
  id: "azure-openai-chat",
  provider: id,
-  transport: OpenAIChat.httpTransport.with({ auth: routeAuth }),
+  auth: routeAuth,
+  endpoint: {
+    query: { "api-version": "v1" },
+  },
 })

 export const routes = [responsesRoute, chatRoute]

-const mapInput = (input: AzureModelInput) => {
-  const { apiKey: _, apiVersion, resourceName, useCompletionUrls, ...rest } = input
-  return {
-    ...withOpenAIOptions(input.id, rest),
-    auth:
-      "auth" in input && input.auth
-        ? input.auth
-        : Auth.remove("authorization").andThen(
-            Auth.optional("apiKey" in input ? input.apiKey : undefined, "apiKey")
-              .orElse(Auth.config("AZURE_OPENAI_API_KEY"))
-              .pipe(Auth.header("api-key")),
-          ),
-    // AtLeastOne guarantees at least one is set; baseURL wins if both are.
-    baseURL: rest.baseURL ?? resourceBaseURL(resourceName!),
-    queryParams: {
-      ...rest.queryParams,
-      "api-version": apiVersion ?? rest.queryParams?.["api-version"] ?? "v1",
+const defaults = (input: Config) => {
+  const {
+    apiKey: _,
+    apiVersion: _apiVersion,
+    resourceName: _resourceName,
+    useCompletionUrls: _useCompletionUrls,
+    baseURL: _baseURL,
+    queryParams: _queryParams,
+    ...rest
+  } = input
+  if ("auth" in rest) {
+    const { auth: _, ...withoutAuth } = rest
+    return withoutAuth
+  }
+  return rest
+}
+
+const auth = (input: Config) => {
+  if ("auth" in input && input.auth) return input.auth
+  return Auth.remove("authorization").andThen(
+    Auth.optional("apiKey" in input ? input.apiKey : undefined, "apiKey")
+      .orElse(Auth.config("AZURE_OPENAI_API_KEY"))
+      .pipe(Auth.header("api-key")),
+  )
+}
+
+const configuredRoute = <Body, Prepared>(route: RouteDef<Body, Prepared>, input: Config) =>
+  route.with({
+    auth: auth(input),
+    endpoint: {
+      // AtLeastOne guarantees at least one is set; baseURL wins if both are.
+      baseURL: input.baseURL ?? resourceBaseURL(input.resourceName!),
+      query: {
+        ...(input.apiVersion ? { "api-version": input.apiVersion } : {}),
+        ...input.queryParams,
+      },
    },
+  })
+
+export const configure = (input: Config) => {
+  const configuredResponsesRoute = configuredRoute(responsesRoute, input)
+  const configuredChatRoute = configuredRoute(chatRoute, input)
+  const modelDefaults = defaults(input)
+
+  const responses = (modelID: string | ModelID) =>
+    configuredResponsesRoute.with(withOpenAIOptions(modelID, modelDefaults)).model({ id: modelID })
+
+  const chat = (modelID: string | ModelID) =>
+    configuredChatRoute.with(withOpenAIOptions(modelID, modelDefaults)).model({ id: modelID })
+
+  return {
+    id,
+    model: (modelID: string | ModelID) => (input.useCompletionUrls === true ? chat(modelID) : responses(modelID)),
+    responses,
+    chat,
+    configure,
  }
 }

-const chatModel = Route.model<AzureModelInput>(chatRoute, {}, { mapInput })
-const responsesModel = Route.model<AzureModelInput>(responsesRoute, {}, { mapInput })
-
-export const responses = (modelID: string | ModelID, options: ModelOptions) =>
-  responsesModel({ ...options, id: modelID })
-
-export const chat = (modelID: string | ModelID, options: ModelOptions) => chatModel({ ...options, id: modelID })
-
-export const model = (modelID: string | ModelID, options: ModelOptions) => {
-  if (options.useCompletionUrls === true) return chat(modelID, options)
-  return responses(modelID, options)
-}
-
-export const provider = Provider.make({
+export const provider = {
  id,
-  model,
-  apis: { responses, chat },
-})
-
-export const apis = provider.apis
+  configure,
+}
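
Reviewer note on the `azure.ts` hunks above: the `api-version` precedence changes. The old `mapInput` preferred `apiVersion` over `queryParams["api-version"]`; the new spread order lets `queryParams` override `apiVersion`. The sketch below reproduces the new order as a plain function. It assumes `route.with` merges the configured `endpoint.query` over the route-level `{ "api-version": "v1" }` default, which this diff does not show; `AzureConfig` and `DEFAULT_QUERY` are hypothetical names.

```typescript
// Hypothetical, trimmed-down config shape for illustration only.
type AzureConfig = {
  resourceName?: string
  baseURL?: string
  apiVersion?: string
  queryParams?: Record<string, string>
}

const resourceBaseURL = (resourceName: string) => `https://${resourceName.trim()}.openai.azure.com/openai/v1`

// Assumption: the route-level default query is merged underneath configured values.
const DEFAULT_QUERY = { "api-version": "v1" }

const resolveEndpoint = (input: AzureConfig) => {
  // baseURL wins if both are supplied.
  const baseURL =
    input.baseURL ?? (input.resourceName === undefined ? undefined : resourceBaseURL(input.resourceName))
  if (baseURL === undefined) throw new Error("resourceName or baseURL is required")
  return {
    baseURL,
    query: {
      ...DEFAULT_QUERY,
      ...(input.apiVersion ? { "api-version": input.apiVersion } : {}),
      // Spread last, so an "api-version" entry here overrides apiVersion.
      ...input.queryParams,
    },
  }
}
```

If the old precedence was intentional, swapping the last two spreads would restore it; worth confirming with the author either way.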
--- a/packages/llm/src/providers/cloudflare.ts
+++ b/packages/llm/src/providers/cloudflare.ts
@@ -1,19 +1,16 @@
 import type { Config, Redacted } from "effect"
-import { type ModelInput } from "../llm"
-import { Provider } from "../provider"
 import * as OpenAICompatibleChat from "../protocols/openai-compatible-chat"
 import { Auth } from "../route/auth"
 import { AuthOptions, type AtLeastOne, type ProviderAuthOption } from "../route/auth-options"
-import { Route } from "../route/client"
+import type { RouteDefaultsInput } from "../route/client"
 import { ProviderID, type ModelID } from "../schema"

 export const aiGatewayID = ProviderID.make("cloudflare-ai-gateway")
 export const workersAIID = ProviderID.make("cloudflare-workers-ai")
-export const id = aiGatewayID
 export const aiGatewayAuthEnvVars = ["CLOUDFLARE_API_TOKEN", "CF_AIG_TOKEN"] as const
 export const workersAIAuthEnvVars = ["CLOUDFLARE_API_KEY", "CLOUDFLARE_WORKERS_AI_TOKEN"] as const

-type CloudflareSecret = string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>>
+type CloudflareSecret = string | Redacted.Redacted | Config.Config<string | Redacted.Redacted>

 type GatewayURL = AtLeastOne<{
  readonly accountId: string
@@ -23,32 +20,26 @@ type GatewayURL = AtLeastOne<{
 }

 export type AIGatewayOptions = GatewayURL &
-  Omit<ModelInput, "id" | "provider" | "route" | "baseURL" | "apiKey" | "auth"> &
+  RouteDefaultsInput &
  ProviderAuthOption<"optional"> & {
    /** Cloudflare AI Gateway authentication token. Sent as `cf-aig-authorization`. */
    readonly gatewayApiKey?: CloudflareSecret
  }

-type AIGatewayInput = AIGatewayOptions & Pick<ModelInput, "id">
-
 type WorkersAIURL = AtLeastOne<{
  readonly accountId: string
  readonly baseURL: string
 }>

-export type WorkersAIOptions = WorkersAIURL &
-  Omit<ModelInput, "id" | "provider" | "route" | "baseURL" | "apiKey" | "auth"> &
-  ProviderAuthOption<"optional">
-
-type WorkersAIInput = WorkersAIOptions & Pick<ModelInput, "id">
+export type WorkersAIOptions = WorkersAIURL & RouteDefaultsInput & ProviderAuthOption<"optional">

 export const aiGatewayBaseURL = (input: GatewayURL) => {
  if (input.baseURL) return input.baseURL
-  if (!input.accountId) throw new Error("Cloudflare.aiGateway requires accountId unless baseURL is supplied")
+  if (!input.accountId) throw new Error("CloudflareAIGateway.configure requires accountId unless baseURL is supplied")
  return `https://gateway.ai.cloudflare.com/v1/${encodeURIComponent(input.accountId)}/${encodeURIComponent(input.gatewayId?.trim() || "default")}/compat`
 }

-const aiGatewayAuth = (input: AIGatewayInput) => {
+const aiGatewayAuth = (input: AIGatewayOptions) => {
  if ("auth" in input && input.auth) return input.auth
  const gateway = Auth.optional(input.gatewayApiKey, "gatewayApiKey")
    .orElse(Auth.config("CLOUDFLARE_API_TOKEN"))
@@ -61,11 +52,11 @@ const aiGatewayAuth = (input: AIGatewayInput) => {

 export const workersAIBaseURL = (input: WorkersAIURL) => {
  if (input.baseURL) return input.baseURL
-  if (!input.accountId) throw new Error("Cloudflare.workersAI requires accountId unless baseURL is supplied")
+  if (!input.accountId) throw new Error("CloudflareWorkersAI.configure requires accountId unless baseURL is supplied")
  return `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(input.accountId)}/ai/v1`
 }

-const workersAIAuth = (input: WorkersAIInput) => {
+const workersAIAuth = (input: WorkersAIOptions) => {
  return AuthOptions.bearer(input, workersAIAuthEnvVars)
 }

@@ -81,59 +72,56 @@ export const workersAIRoute = OpenAICompatibleChat.route.with({

 export const routes = [aiGatewayRoute, workersAIRoute]

-const aiGatewayModel = Route.model<AIGatewayInput>(
-  aiGatewayRoute,
-  {
-    provider: id,
-  },
-  {
-    mapInput: (input) => {
-      const {
-        accountId: _accountId,
-        gatewayId: _gatewayId,
-        apiKey: _apiKey,
-        gatewayApiKey: _gatewayApiKey,
-        auth: _auth,
-        ...rest
-      } = input
-      return {
-        ...rest,
-        auth: aiGatewayAuth(input),
-        baseURL: aiGatewayBaseURL(input),
-      }
-    },
-  },
-)
+const aiGatewayDefaults = (options: AIGatewayOptions) => {
+  const {
+    accountId: _accountId,
+    gatewayId: _gatewayId,
+    apiKey: _apiKey,
+    gatewayApiKey: _gatewayApiKey,
+    baseURL: _baseURL,
+    auth: _auth,
+    ...rest
+  } = options
+  return rest
+}

-const workersAIModel = Route.model<WorkersAIInput>(
-  workersAIRoute,
-  {
-    provider: workersAIID,
-  },
-  {
-    mapInput: (input) => {
-      const { accountId: _accountId, apiKey: _apiKey, auth: _auth, ...rest } = input
-      return {
-        ...rest,
-        auth: workersAIAuth(input),
-        baseURL: workersAIBaseURL(input),
-      }
-    },
-  },
-)
+const workersAIDefaults = (options: WorkersAIOptions) => {
+  const { accountId: _accountId, apiKey: _apiKey, auth: _auth, baseURL: _baseURL, ...rest } = options
+  return rest
+}

-export const aiGateway = (modelID: string | ModelID, options: AIGatewayOptions) =>
-  aiGatewayModel({ ...options, id: modelID })
+const configureAIGateway = (options: AIGatewayOptions) => {
+  const route = aiGatewayRoute.with({
+    ...aiGatewayDefaults(options),
+    endpoint: { baseURL: aiGatewayBaseURL(options) },
+    auth: aiGatewayAuth(options),
+  })
+  return {
+    id: aiGatewayID,
+    model: (modelID: string | ModelID) => route.model({ id: modelID }),
+    configure: configureAIGateway,
+  }
+}

-export const workersAI = (modelID: string | ModelID, options: WorkersAIOptions) =>
-  workersAIModel({ ...options, id: modelID })
+const configureWorkersAI = (options: WorkersAIOptions) => {
+  const route = workersAIRoute.with({
+    ...workersAIDefaults(options),
+    endpoint: { baseURL: workersAIBaseURL(options) },
+    auth: workersAIAuth(options),
+  })
+  return {
+    id: workersAIID,
+    model: (modelID: string | ModelID) => route.model({ id: modelID }),
+    configure: configureWorkersAI,
+  }
+}

-export const model = aiGateway
+export const CloudflareAIGateway = {
+  id: aiGatewayID,
+  configure: configureAIGateway,
+}

-export const provider = Provider.make({
-  id,
-  model,
-  apis: { aiGateway, workersAI },
-})
-
-export const apis = provider.apis
+export const CloudflareWorkersAI = {
+  id: workersAIID,
+  configure: configureWorkersAI,
+}
--- a/packages/llm/src/providers/github-copilot.ts
+++ b/packages/llm/src/providers/github-copilot.ts
@@ -1,6 +1,5 @@
-import { Route } from "../route/client"
-import type { ModelInput } from "../llm"
-import { Provider } from "../provider"
+import { AuthOptions, type ProviderAuthOption } from "../route/auth-options"
+import type { RouteDefaultsInput } from "../route/client"
 import { ProviderID, type ModelID } from "../schema"
 import * as OpenAIChat from "../protocols/openai-chat"
 import * as OpenAIResponses from "../protocols/openai-responses"
@@ -10,10 +9,11 @@ export const id = ProviderID.make("github-copilot")

 // GitHub Copilot has no canonical public URL — callers (opencode, etc.) must
 // supply `baseURL` explicitly.
-export type ModelOptions = Omit<ModelInput, "id" | "provider" | "route"> & {
-  readonly providerOptions?: OpenAIProviderOptionsInput
-}
-type CopilotModelInput = ModelOptions & Pick<ModelInput, "id">
+export type ModelOptions = Omit<RouteDefaultsInput, "providerOptions"> &
+  ProviderAuthOption<"optional"> & {
+    readonly baseURL: string
+    readonly providerOptions?: OpenAIProviderOptionsInput
+  }

 export const shouldUseResponsesApi = (modelID: string | ModelID) => {
  const model = String(modelID)
@@ -24,25 +24,43 @@ export const shouldUseResponsesApi = (modelID: string | ModelID) => {

 export const routes = [OpenAIResponses.route, OpenAIChat.route]

-const mapInput = (input: CopilotModelInput) => withOpenAIOptions(input.id, input)
+const chatRoute = OpenAIChat.route.with({ provider: id })
+const responsesRoute = OpenAIResponses.route.with({ provider: id })

-const chatModel = Route.model<CopilotModelInput>(OpenAIChat.route, { provider: id }, { mapInput })
-const responsesModel = Route.model<CopilotModelInput>(OpenAIResponses.route, { provider: id }, { mapInput })
-
-export const responses = (modelID: string | ModelID, options: ModelOptions) =>
-  responsesModel({ ...options, id: modelID })
-
-export const chat = (modelID: string | ModelID, options: ModelOptions) => chatModel({ ...options, id: modelID })
-
-export const model = (modelID: string | ModelID, options: ModelOptions) => {
-  const create = shouldUseResponsesApi(modelID) ? responsesModel : chatModel
-  return create({ ...options, id: modelID })
+const defaults = (options: ModelOptions) => {
+  const { apiKey: _, auth: _auth, baseURL: _baseURL, ...rest } = options
+  return rest
 }

-export const provider = Provider.make({
-  id,
-  model,
-  apis: { responses, chat },
-})
+const configuredResponsesRoute = (options: ModelOptions) =>
+  responsesRoute.with({
+    endpoint: { baseURL: options.baseURL },
+    auth: AuthOptions.bearer(options, []),
+  })

-export const apis = provider.apis
+const configuredChatRoute = (options: ModelOptions) =>
+  chatRoute.with({
+    endpoint: { baseURL: options.baseURL },
+    auth: AuthOptions.bearer(options, []),
+  })
+
+export const configure = (options: ModelOptions) => {
+  const responsesRoute = configuredResponsesRoute(options)
+  const chatRoute = configuredChatRoute(options)
+  const responses = (modelID: string | ModelID) =>
+    responsesRoute.with(withOpenAIOptions(modelID, defaults(options))).model({ id: modelID })
+  const chat = (modelID: string | ModelID) =>
+    chatRoute.with(withOpenAIOptions(modelID, defaults(options))).model({ id: modelID })
+  return {
+    id,
+    model: (modelID: string | ModelID) => (shouldUseResponsesApi(modelID) ? responses(modelID) : chat(modelID)),
+    responses,
+    chat,
+    configure,
+  }
+}
+
+export const provider = {
+  id,
+  configure,
+}
--- a/packages/llm/src/providers/google.ts
+++ b/packages/llm/src/providers/google.ts
@@ -1,5 +1,6 @@
-import type { RouteModelInput } from "../route/client"
-import { Provider } from "../provider"
+import type { RouteDefaultsInput } from "../route/client"
+import { Auth } from "../route/auth"
+import type { ProviderAuthOption } from "../route/auth-options"
 import { ProviderID, type ModelID } from "../schema"
 import * as Gemini from "../protocols/gemini"

@@ -7,12 +8,28 @@ export const id = ProviderID.make("google")

 export const routes = [Gemini.route]

-export const model = (
-  id: string | ModelID,
-  options: Omit<RouteModelInput, "id" | "baseURL"> & { readonly baseURL?: string } = {},
-) => Gemini.model({ ...options, id })
+export type Config = RouteDefaultsInput & ProviderAuthOption<"optional"> & { readonly baseURL?: string }

-export const provider = Provider.make({
-  id,
-  model,
-})
+const auth = (options: ProviderAuthOption<"optional">) => {
+  if ("auth" in options && options.auth) return options.auth
+  return Auth.optional("apiKey" in options ? options.apiKey : undefined, "apiKey")
+    .orElse(Auth.config("GOOGLE_GENERATIVE_AI_API_KEY"))
+    .pipe(Auth.header("x-goog-api-key"))
+}
+
+const configuredRoute = (input: Config) => {
+  const { apiKey: _, auth: _auth, baseURL, ...rest } = input
+  return Gemini.route.with({ ...rest, endpoint: { baseURL }, auth: auth(input) })
+}
+
+export const configure = (input: Config = {}) => {
+  const route = configuredRoute(input)
+  return {
+    id,
+    model: (modelID: string | ModelID) => route.model({ id: modelID }),
+    configure,
+  }
+}
+
+export const provider = configure()
+export const model = provider.model
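
Review note on the `auth` helper above: the fallback order is an explicit `auth` first, then an explicit `apiKey`, then the `GOOGLE_GENERATIVE_AI_API_KEY` config value, rendered as an `x-goog-api-key` header. The sketch below illustrates only that ordering. It is dependency-free and is not the package's `Auth` module: the `Credential` type, `optional`, `fromEnv`, and `googleHeaders` are hypothetical stand-ins, and it resolves secrets eagerly with exceptions where the real code uses Effect and redacted values.

```typescript
// Hypothetical, simplified stand-in for a credential chain (assumption: no Effect, no Redacted).
type Credential = {
  readonly load: () => string // throws when the secret is missing
  readonly orElse: (that: Credential) => Credential
  readonly header: (name: string) => Record<string, string>
}

const credential = (load: () => string): Credential => ({
  load,
  // Try this credential first and fall back to `that` only if loading fails.
  orElse: (that) =>
    credential(() => {
      try {
        return load()
      } catch {
        return that.load()
      }
    }),
  // Render the resolved secret into a single named header.
  header: (name) => ({ [name]: load() }),
})

// Missing or empty secrets are treated as a failed load.
const optional = (secret: string | undefined, source: string): Credential =>
  credential(() => {
    if (secret === undefined || secret === "") throw new Error(`Missing credential: ${source}`)
    return secret
  })

// Hypothetical environment lookup; the env map is passed in so the sketch stays pure.
const fromEnv = (env: Record<string, string | undefined>, name: string) => optional(env[name], name)

const googleHeaders = (options: { apiKey?: string }, env: Record<string, string | undefined>) =>
  optional(options.apiKey, "apiKey")
    .orElse(fromEnv(env, "GOOGLE_GENERATIVE_AI_API_KEY"))
    .header("x-goog-api-key")
```

In this sketch an explicit key always wins over the environment value, and a request with neither fails at header time rather than at configuration time.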
--- a/packages/llm/src/providers/index.ts
+++ b/packages/llm/src/providers/index.ts
@@ -2,6 +2,7 @@ export * as Anthropic from "./anthropic"
 export * as AmazonBedrock from "./amazon-bedrock"
 export * as Azure from "./azure"
 export * as Cloudflare from "./cloudflare"
+export { CloudflareAIGateway, CloudflareWorkersAI } from "./cloudflare"
 export * as GitHubCopilot from "./github-copilot"
 export * as Google from "./google"
 export * as OpenAI from "./openai"
--- a/packages/llm/src/providers/openai-compatible.ts
+++ b/packages/llm/src/providers/openai-compatible.ts
@@ -1,56 +1,60 @@
-import { Provider } from "../provider"
 import { ProviderID, type ModelID } from "../schema"
 import * as OpenAICompatibleChat from "../protocols/openai-compatible-chat"
-import type { OpenAICompatibleChatModelInput } from "../protocols/openai-compatible-chat"
+import type { RouteDefaultsInput } from "../route/client"
+import { AuthOptions, type ProviderAuthOption } from "../route/auth-options"
 import { profiles, type OpenAICompatibleProfile } from "./openai-compatible-profile"

 export const id = ProviderID.make("openai-compatible")

-export type ModelOptions = Omit<OpenAICompatibleChatModelInput, "id" | "provider"> & {
-  readonly provider: string
-}
+type GenericModelOptions = RouteDefaultsInput &
+  ProviderAuthOption<"optional"> & {
+    readonly provider?: string
+    readonly baseURL: string
+  }

-type GenericModelOptions = Omit<ModelOptions, "provider"> & {
-  readonly provider?: string
-}
-
-export type FamilyModelOptions = Omit<OpenAICompatibleChatModelInput, "id" | "provider" | "baseURL"> & {
-  readonly baseURL?: string
-}
+export type FamilyModelOptions = RouteDefaultsInput &
+  ProviderAuthOption<"optional"> & {
+    readonly baseURL?: string
+  }

 export const routes = [OpenAICompatibleChat.route]

-export const model = (id: string | ModelID, options: ModelOptions) => {
-  return OpenAICompatibleChat.model({
-    ...options,
-    id,
-    provider: ProviderID.make(options.provider),
+export const configure = (input: GenericModelOptions) => {
+  const provider = input.provider ?? "openai-compatible"
+  const { provider: _, baseURL, apiKey: _apiKey, auth: _auth, ...rest } = input
+  const route = OpenAICompatibleChat.route.with({
+    ...rest,
+    provider,
+    endpoint: { baseURL },
+    auth: AuthOptions.bearer(input, []),
  })
+  return {
+    id: ProviderID.make(provider),
+    model: (modelID: string | ModelID) => route.model({ id: modelID, provider: ProviderID.make(provider) }),
+    configure,
+  }
 }

-export const profileModel = (
-  profile: OpenAICompatibleProfile,
-  id: string | ModelID,
-  options: FamilyModelOptions = {},
-) =>
-  OpenAICompatibleChat.model({
-    ...options,
-    id,
-    provider: profile.provider,
-    baseURL: options.baseURL ?? profile.baseURL,
-  })
+const define = (profile: OpenAICompatibleProfile) => {
+  const configureProfile = (input: FamilyModelOptions = {}) => {
+    const facade = configure({
+      ...input,
+      baseURL: input.baseURL ?? profile.baseURL,
+      provider: profile.provider,
+    })
+    return {
+      id: ProviderID.make(profile.provider),
+      model: facade.model,
+      configure: configureProfile,
+    }
+  }
+  return configureProfile()
+}

-const define = (profile: OpenAICompatibleProfile) =>
-  Provider.make({
-    id: ProviderID.make(profile.provider),
-    model: (id: string | ModelID, options: FamilyModelOptions = {}) => profileModel(profile, id, options),
-  })
-
-export const provider = Provider.make({
+export const provider = {
  id,
-  model: (id: string | ModelID, options: GenericModelOptions) =>
-    model(id, { ...options, provider: options.provider ?? "openai-compatible" }),
-})
+  configure,
+}

 export const baseten = define(profiles.baseten)
 export const cerebras = define(profiles.cerebras)
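
Review note on `define` above: each profile export is now a pre-configured facade whose `configure` points back at the profile-bound `configureProfile`. The sketch below shows that shape in isolation, under assumptions: `Profile`, `Options`, `ModelHandle`, and `Facade` are hypothetical types, and the model is a plain object rather than a route-backed `Model`.

```typescript
// Hypothetical types for illustration; they are not the package's real shapes.
type Profile = { readonly provider: string; readonly baseURL: string }
type Options = { readonly baseURL?: string }
type ModelHandle = { readonly id: string; readonly provider: string; readonly baseURL: string }
type Facade = {
  readonly id: string
  readonly model: (modelID: string) => ModelHandle
  readonly configure: (input?: Options) => Facade
}

const define = (profile: Profile): Facade => {
  const configureProfile = (input: Options = {}): Facade => {
    // A caller-supplied baseURL overrides the profile default.
    const baseURL = input.baseURL ?? profile.baseURL
    return {
      id: profile.provider,
      model: (modelID) => ({ id: modelID, provider: profile.provider, baseURL }),
      // Reconfiguring starts again from the profile, not from this facade.
      configure: configureProfile,
    }
  }
  // The exported value is the facade configured with profile defaults only.
  return configureProfile()
}

const example = define({ provider: "example", baseURL: "https://api.example.test/v1" })
const proxied = example.configure({ baseURL: "https://proxy.example.test/v1" })
```

One consequence worth checking in review: because `configure` always closes over the profile, calling it on an already configured facade does not layer on top of the earlier options, at least in this sketch.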
--- a/packages/llm/src/providers/openai-options.ts
+++ b/packages/llm/src/providers/openai-options.ts
@@ -59,10 +59,9 @@ export const withOpenAIOptions = <Options extends { readonly providerOptions?: O
  modelID: string,
  options: Options,
  defaults: { readonly textVerbosity?: boolean } = {},
-): Options & { readonly id: string; readonly providerOptions?: ProviderOptions } => {
+): Omit<Options, "providerOptions"> & { readonly providerOptions?: ProviderOptions } => {
  return {
    ...options,
-    id: modelID,
    providerOptions: mergeProviderOptions(openAIDefaultOptions(modelID, defaults), options.providerOptions),
  }
 }
--- a/packages/llm/src/providers/openai.ts
+++ b/packages/llm/src/providers/openai.ts
@@ -1,6 +1,5 @@
 import { AuthOptions, type ProviderAuthOption } from "../route/auth-options"
-import type { RouteModelInput } from "../route/client"
-import { Provider } from "../provider"
+import type { Route, RouteDefaultsInput } from "../route/client"
 import { ProviderID, type ModelID } from "../schema"
 import * as OpenAIChat from "../protocols/openai-chat"
 import * as OpenAIResponses from "../protocols/openai-responses"
@@ -15,39 +14,50 @@ export const routes = [OpenAIResponses.route, OpenAIResponses.webSocketRoute, Op
 // This provider facade wraps the lower-level Responses and Chat model factories
 // with OpenAI-specific conveniences: typed options, API-key sugar, env fallback,
 // and default option normalization.
-type OpenAIModelInput<ModelInput> = Omit<ModelInput, "apiKey" | "auth" | "baseURL"> &
+export type Config = RouteDefaultsInput &
  ProviderAuthOption<"optional"> & {
    readonly baseURL?: string
+    readonly queryParams?: Record<string, string>
    readonly providerOptions?: OpenAIProviderOptionsInput
  }

 const auth = (options: ProviderAuthOption<"optional">) => AuthOptions.bearer(options, "OPENAI_API_KEY")

-export const responses = (id: string | ModelID, options: OpenAIModelInput<Omit<RouteModelInput, "id">> = {}) => {
-  const { apiKey: _, ...rest } = options
-  return OpenAIResponses.model(withOpenAIOptions(id, { ...rest, auth: auth(options) }, { textVerbosity: true }))
+const defaults = (input: Config) => {
+  const { apiKey: _, auth: _auth, baseURL: _baseURL, queryParams: _queryParams, ...rest } = input
+  return rest
 }

-export const responsesWebSocket = (
-  id: string | ModelID,
-  options: OpenAIModelInput<Omit<RouteModelInput, "id">> = {},
-) => {
-  const { apiKey: _, ...rest } = options
-  return OpenAIResponses.webSocketModel(
-    withOpenAIOptions(id, { ...rest, auth: auth(options) }, { textVerbosity: true }),
-  )
+const configuredRoute = <Body, Prepared>(route: Route<Body, Prepared>, input: Config) =>
+  route.with({
+    auth: auth(input),
+    endpoint: { baseURL: input.baseURL, query: input.queryParams },
+  })
+
+export const configure = (input: Config = {}) => {
+  const responsesRoute = configuredRoute(OpenAIResponses.route, input)
+  const responsesWebSocketRoute = configuredRoute(OpenAIResponses.webSocketRoute, input)
+  const chatRoute = configuredRoute(OpenAIChat.route, input)
+  const modelDefaults = defaults(input)
+  const responses = (id: string | ModelID) =>
+    responsesRoute.with(withOpenAIOptions(id, modelDefaults, { textVerbosity: true })).model({ id })
+  const responsesWebSocket = (id: string | ModelID) =>
+    responsesWebSocketRoute.with(withOpenAIOptions(id, modelDefaults, { textVerbosity: true })).model({ id })
+  const chat = (id: string | ModelID) => chatRoute.with(withOpenAIOptions(id, modelDefaults)).model({ id })
+
+  return {
+    id,
+    model: responses,
+    responses,
+    responsesWebSocket,
+    chat,
+    configure,
+  }
 }

-export const chat = (id: string | ModelID, options: OpenAIModelInput<Omit<RouteModelInput, "id">> = {}) => {
-  const { apiKey: _, ...rest } = options
-  return OpenAIChat.model(withOpenAIOptions(id, { ...rest, auth: auth(options) }))
-}
-
-export const provider = Provider.make({
-  id,
-  model: responses,
-  apis: { responses, responsesWebSocket, chat },
-})
+export const provider = configure()

 export const model = provider.model
-export const apis = provider.apis
+export const responses = provider.responses
+export const responsesWebSocket = provider.responsesWebSocket
+export const chat = provider.chat
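
Review note on `defaults` above: the function uses rest destructuring to drop the route-level keys (`apiKey`, `auth`, `baseURL`, `queryParams`) so that only model defaults reach `withOpenAIOptions`. A minimal sketch of that idiom follows. The `SketchConfig` type is a hypothetical, trimmed-down stand-in for the real `Config`.

```typescript
// Hypothetical, reduced config shape (assumption: the real Config has more fields).
type SketchConfig = {
  readonly apiKey?: string
  readonly baseURL?: string
  readonly queryParams?: Record<string, string>
  readonly headers?: Record<string, string>
}

// Bind the route-level keys to throwaway names and keep everything else.
const defaults = (input: SketchConfig) => {
  const { apiKey: _apiKey, baseURL: _baseURL, queryParams: _queryParams, ...rest } = input
  return rest
}

const modelDefaults = defaults({
  apiKey: "k",
  baseURL: "https://example.test",
  headers: { "x-trace": "1" },
})
```

The result type no longer includes the dropped keys, so passing it on cannot accidentally forward a credential or endpoint override as a model default.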
--- a/packages/llm/src/providers/openrouter.ts
+++ b/packages/llm/src/providers/openrouter.ts
@@ -1,9 +1,9 @@
 import { Effect, Schema } from "effect"
-import { Route, type RouteModelInput } from "../route/client"
+import { Route, type RouteDefaultsInput } from "../route/client"
 import { Endpoint } from "../route/endpoint"
 import { Framing } from "../route/framing"
-import { Provider } from "../provider"
 import { Protocol } from "../route/protocol"
+import { AuthOptions, type ProviderAuthOption } from "../route/auth-options"
 import { ProviderID, type ModelID, type ProviderOptions } from "../schema"
 import * as OpenAICompatibleProfiles from "./openai-compatible-profile"
 import * as OpenAIChat from "../protocols/openai-chat"
@@ -24,11 +24,11 @@ export type OpenRouterProviderOptionsInput = ProviderOptions & {
  readonly openrouter?: OpenRouterOptions
 }

-export type ModelOptions = Omit<RouteModelInput, "id" | "baseURL" | "providerOptions"> & {
-  readonly baseURL?: string
-  readonly providerOptions?: OpenRouterProviderOptionsInput
-}
-type ModelInput = ModelOptions & Pick<RouteModelInput, "id">
+export type ModelOptions = Omit<RouteDefaultsInput, "providerOptions"> &
+  ProviderAuthOption<"optional"> & {
+    readonly baseURL?: string
+    readonly providerOptions?: OpenRouterProviderOptionsInput
+  }

 const OpenRouterBody = Schema.StructWithRest(Schema.Struct(OpenAIChat.bodyFields), [
  Schema.Record(Schema.String, Schema.Any),
@@ -68,21 +68,31 @@ const bodyOptions = (input: unknown) => {

 export const route = Route.make({
  id: ADAPTER,
+  provider: profile.provider,
  protocol,
-  endpoint: Endpoint.path("/chat/completions"),
+  endpoint: Endpoint.path("/chat/completions", { baseURL: profile.baseURL }),
  framing: Framing.sse,
 })

 export const routes = [route]

-const modelRef = Route.model<ModelInput>(route, {
-  provider: profile.provider,
-  baseURL: profile.baseURL,
-})
+const configuredRoute = (input: ModelOptions) => {
+  const { apiKey: _, auth: _auth, baseURL, ...rest } = input
+  return route.with({
+    ...rest,
+    endpoint: { baseURL: baseURL ?? profile.baseURL },
+    auth: AuthOptions.bearer(input, "OPENROUTER_API_KEY"),
+  })
+}

-export const model = (id: string | ModelID, options: ModelOptions = {}) => modelRef({ ...options, id })
+export const configure = (input: ModelOptions = {}) => {
+  const route = configuredRoute(input)
+  return {
+    id,
+    model: (modelID: string | ModelID) => route.model({ id: modelID }),
+    configure,
+  }
+}

-export const provider = Provider.make({
-  id,
-  model,
-})
+export const provider = configure()
+export const model = provider.model
--- a/packages/llm/src/providers/xai.ts
+++ b/packages/llm/src/providers/xai.ts
@@ -1,7 +1,5 @@
 import { AuthOptions, type ProviderAuthOption } from "../route/auth-options"
-import { Route } from "../route/client"
-import type { RouteModelInput } from "../route/client"
-import { Provider } from "../provider"
+import type { RouteDefaultsInput } from "../route/client"
 import { ProviderID, type ModelID } from "../schema"
 import * as OpenAICompatibleProfiles from "./openai-compatible-profile"
 import * as OpenAICompatibleChat from "../protocols/openai-compatible-chat"
@@ -9,44 +7,50 @@ import * as OpenAIResponses from "../protocols/openai-responses"

 export const id = ProviderID.make("xai")

-export type ModelOptions = Omit<RouteModelInput, "id" | "apiKey" | "auth" | "baseURL"> &
+export type ModelOptions = RouteDefaultsInput &
  ProviderAuthOption<"optional"> & {
    readonly baseURL?: string
  }

 export const routes = [OpenAIResponses.route, OpenAICompatibleChat.route]

-const responsesModel = Route.model(OpenAIResponses.route, { provider: id })
-const chatModel = OpenAICompatibleChat.model
-
 const auth = (options: ProviderAuthOption<"optional">) => AuthOptions.bearer(options, "XAI_API_KEY")

-export const responses = (modelID: string | ModelID, options: ModelOptions = {}) => {
-  const { apiKey: _, ...rest } = options
-  return responsesModel({
+const configuredResponsesRoute = (input: ModelOptions) => {
+  const { apiKey: _, auth: _auth, baseURL, ...rest } = input
+  return OpenAIResponses.route.with({
    ...rest,
-    auth: auth(options),
-    id: modelID,
-    baseURL: options.baseURL ?? OpenAICompatibleProfiles.profiles.xai.baseURL,
-  })
-}
-
-export const chat = (modelID: string | ModelID, options: ModelOptions = {}) => {
-  const { apiKey: _, ...rest } = options
-  return chatModel({
-    ...rest,
-    auth: auth(options),
-    id: modelID,
    provider: id,
-    baseURL: options.baseURL ?? OpenAICompatibleProfiles.profiles.xai.baseURL,
+    endpoint: { baseURL: baseURL ?? OpenAICompatibleProfiles.profiles.xai.baseURL },
+    auth: auth(input),
  })
 }

-export const provider = Provider.make({
-  id,
-  model: responses,
-  apis: { responses, chat },
-})
+const configuredChatRoute = (input: ModelOptions) => {
+  const { apiKey: _, auth: _auth, baseURL, ...rest } = input
+  return OpenAICompatibleChat.route.with({
+    ...rest,
+    provider: id,
+    endpoint: { baseURL: baseURL ?? OpenAICompatibleProfiles.profiles.xai.baseURL },
+    auth: auth(input),
+  })
+}

+export const configure = (input: ModelOptions = {}) => {
+  const responsesRoute = configuredResponsesRoute(input)
+  const chatRoute = configuredChatRoute(input)
+  const responses = (modelID: string | ModelID) => responsesRoute.model({ id: modelID })
+  const chat = (modelID: string | ModelID) => chatRoute.model({ id: modelID })
+  return {
+    id,
+    model: responses,
+    responses,
+    chat,
+    configure,
+  }
+}
+
+export const provider = configure()
 export const model = provider.model
-export const apis = provider.apis
+export const responses = provider.responses
+export const chat = provider.chat
--- a/packages/llm/src/route/auth.ts
+++ b/packages/llm/src/route/auth.ts
@@ -12,6 +12,7 @@ export class MissingCredentialError extends Error {

 export type CredentialError = MissingCredentialError | Config.ConfigError
 export type AuthError = CredentialError | LLMError
+type Secret = string | Redacted.Redacted | Config.Config<string | Redacted.Redacted>

 export interface AuthInput {
  readonly request: LLMRequest
@@ -22,7 +23,7 @@ export interface AuthInput {
 }

 export interface Credential {
-  readonly load: Effect.Effect<Redacted.Redacted<string>, CredentialError>
+  readonly load: Effect.Effect<Redacted.Redacted, CredentialError>
  readonly orElse: (that: Credential) => Credential
  readonly bearer: () => Auth
  readonly header: (name: string) => Auth
@@ -39,7 +40,7 @@ export interface Auth {
 export const isAuth = (input: unknown): input is Auth =>
  typeof input === "object" && input !== null && "apply" in input && typeof input.apply === "function"

-const credential = (load: Effect.Effect<Redacted.Redacted<string>, CredentialError>): Credential => {
+const credential = (load: Effect.Effect<Redacted.Redacted, CredentialError>): Credential => {
  const self: Credential = {
    load,
    orElse: (that) => credential(load.pipe(Effect.catch(() => that.load))),
@@ -66,16 +67,13 @@ const fromCredential = (source: Credential, render: (secret: string) => Headers.
    source.load.pipe(Effect.map((secret) => Headers.setAll(input.headers, render(Redacted.value(secret))))),
  )

-const secretEffect = (secret: string | Redacted.Redacted<string>, source: string) => {
+const secretEffect = (secret: string | Redacted.Redacted, source: string) => {
  const redacted = typeof secret === "string" ? Redacted.make(secret) : secret
  if (Redacted.value(redacted) === "") return Effect.fail(new MissingCredentialError(source))
  return Effect.succeed(redacted)
 }

-const credentialFromSecret = (
-  secret: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>>,
-  source: string,
-) => {
+const credentialFromSecret = (secret: Secret, source: string) => {
  if (typeof secret === "string" || Redacted.isRedacted(secret)) return credential(secretEffect(secret, source))
  return credential(
    Effect.gen(function* () {
@@ -86,17 +84,14 @@ const credentialFromSecret = (

 export const value = (secret: string, source = "value") => credentialFromSecret(secret, source)

-export const optional = (
-  secret: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | undefined,
-  source = "optional value",
-) =>
+export const optional = (secret: Secret | undefined, source = "optional value") =>
  secret === undefined
    ? credential(Effect.fail(new MissingCredentialError(source)))
    : credentialFromSecret(secret, source)

 export const config = (name: string) => credentialFromSecret(Config.redacted(name), name)

-export const effect = (load: Effect.Effect<Redacted.Redacted<string>, CredentialError>) => credential(load)
+export const effect = (load: Effect.Effect<Redacted.Redacted, CredentialError>) => credential(load)

 export const none = auth((input) => Effect.succeed(input.headers))

@@ -109,68 +104,32 @@ export const custom = (apply: (input: AuthInput) => Effect.Effect<Headers.Header

 export const passthrough = none

-const fromModelApiKey = (from: (apiKey: string) => Headers.Input) =>
-  auth(({ request, headers }) => {
-    const key = request.model.apiKey
-    if (!key) return Effect.succeed(headers)
-    return Effect.succeed(Headers.setAll(headers, from(key)))
-  })
-
-const credentialInput = (
-  source: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-) =>
+const credentialInput = (source: Secret | Credential) =>
  typeof source === "string" || Redacted.isRedacted(source) || Config.isConfig(source)
    ? credentialFromSecret(source, "value")
    : source

-export function bearer(): Auth
-export function bearer(
-  source: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-): Auth
-export function bearer(
-  source?: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-) {
-  if (source === undefined) return fromModelApiKey((key) => ({ authorization: `Bearer ${key}` }))
+export function bearer(source: Secret | Credential): Auth
+export function bearer(source: Secret | Credential) {
  return credentialInput(source).bearer()
 }

 export const apiKey = bearer

-export const apiKeyHeader = (name: string) => fromModelApiKey((key) => ({ [name]: key }))
-
-export function header(
-  name: string,
-): (source: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential) => Auth
-export function header(
-  name: string,
-  source: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-): Auth
-export function header(
-  name: string,
-  source?: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-) {
+export function header(name: string): (source: Secret | Credential) => Auth
+export function header(name: string, source: Secret | Credential): Auth
+export function header(name: string, source?: Secret | Credential) {
  if (source === undefined) {
-    return (
-      next: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-    ) => credentialInput(next).header(name)
+    return (next: Secret | Credential) => credentialInput(next).header(name)
  }
  return credentialInput(source).header(name)
 }

-export function bearerHeader(
-  name: string,
-): (source: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential) => Auth
-export function bearerHeader(
-  name: string,
-  source: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-): Auth
-export function bearerHeader(
-  name: string,
-  source?: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-) {
-  const render = (
-    input: string | Redacted.Redacted<string> | Config.Config<string | Redacted.Redacted<string>> | Credential,
-  ) => fromCredential(credentialInput(input), (secret) => ({ [name]: `Bearer ${secret}` }))
+export function bearerHeader(name: string): (source: Secret | Credential) => Auth
+export function bearerHeader(name: string, source: Secret | Credential): Auth
+export function bearerHeader(name: string, source?: Secret | Credential) {
+  const render = (input: Secret | Credential) =>
+    fromCredential(credentialInput(input), (secret) => ({ [name]: `Bearer ${secret}` }))
  if (source === undefined) return render
  return render(source)
 }
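
Review note on the `header` and `bearerHeader` overloads above: both accept either `(name, source)` or a curried `(name)` that returns a function, which is what lets the Google provider write `.pipe(Auth.header("x-goog-api-key"))`. The sketch below reproduces only the overload pattern. It is an illustration with plain strings and a hypothetical `HeaderMap` type, not the package's `Secret | Credential` inputs or `Auth` results.

```typescript
// Hypothetical result type; the real functions return an Auth value.
type HeaderMap = Record<string, string>

// Curried form: fix the header name now, supply the secret later.
function header(name: string): (secret: string) => HeaderMap
// Direct form: supply both at once.
function header(name: string, secret: string): HeaderMap
function header(name: string, secret?: string) {
  const render = (value: string): HeaderMap => ({ [name]: value })
  if (secret === undefined) return render
  return render(secret)
}

const direct = header("x-api-key", "secret")
const curried = header("x-api-key")
const applied = curried("secret")
```

Both call styles produce the same header map here. The curried form exists so the function can be used as a pipeline step.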
--- a/packages/llm/src/route/client.ts
+++ b/packages/llm/src/route/client.ts
@@ -1,31 +1,28 @@
 import { Cause, Context, Effect, Layer, Schema, Stream } from "effect"
-import type { Auth as AuthDef } from "./auth"
-import type { Endpoint } from "./endpoint"
+import * as Option from "effect/Option"
+import { Auth, type Auth as AuthDef } from "./auth"
+import { Endpoint, type EndpointPatch } from "./endpoint"
 import { RequestExecutor } from "./executor"
 import type { Framing } from "./framing"
 import { HttpTransport } from "./transport"
 import type { Transport, TransportRuntime } from "./transport"
 import { WebSocketExecutor } from "./transport"
-import type { Service as WebSocketExecutorService } from "./transport/websocket"
 import type { Protocol } from "./protocol"
 import { applyCachePolicy } from "../cache-policy"
 import * as ProviderShared from "../protocols/shared"
 import * as ToolRuntime from "../tool-runtime"
 import type { Tools } from "../tool"
-import type { LLMError, LLMEvent, PreparedRequestOf, ProtocolID } from "../schema"
+import type { LLMError, LLMEvent, PreparedRequestOf, ProtocolID, ProviderOptions } from "../schema"
 import {
  GenerationOptions,
  HttpOptions,
  LLMRequest,
  LLMResponse,
-  ModelID,
+  Model,
  ModelLimits,
-  ModelRef,
  LLMError as LLMErrorClass,
-  NoRouteReason,
  PreparedRequest,
  ProviderID,
-  RouteID,
  mergeGenerationOptions,
  mergeHttpOptions,
  mergeProviderOptions,
@@ -42,11 +39,13 @@ export interface Route<Body, Prepared = unknown> {
  readonly id: string
  readonly provider?: ProviderID
  readonly protocol: ProtocolID
+  readonly endpoint: Endpoint<Body>
+  readonly auth: AuthDef
  readonly transport: Transport<Body, Prepared, unknown>
  readonly defaults: RouteDefaults
  readonly body: RouteBody<Body>
  readonly with: (patch: RoutePatch<Body, Prepared>) => Route<Body, Prepared>
-  readonly model: <Input extends RouteModelInput = RouteModelInput>(input: Input) => ModelRef
+  readonly model: (input: RouteMappedModelInput) => Model
  readonly prepareTransport: (body: Body, request: LLMRequest) => Effect.Effect<Prepared, LLMError>
  readonly streamPrepared: (
    prepared: Prepared,
@@ -61,116 +60,77 @@ export interface Route<Body, Prepared = unknown> {
 // oxlint-disable-next-line typescript-eslint/no-explicit-any
 export type AnyRoute = Route<any, any>

-const routeRegistry = new Map<string, AnyRoute>()
-
-// Route lookup is intentionally global: model refs name a route id, and
-// importing the provider/protocol/custom-route module registers the runnable
-// implementation. Duplicate ids are bugs because model refs cannot disambiguate
-// them.
-const register = <R extends AnyRoute>(route: R): R => {
-  const existing = routeRegistry.get(route.id)
-  if (existing && existing !== route) throw new Error(`Duplicate LLM route id "${route.id}"`)
-  routeRegistry.set(route.id, route)
-  return route
-}
-
-const registeredRoute = (id: string) => routeRegistry.get(id)
-
 export type HttpOptionsInput = HttpOptions.Input

-export type ModelRefInput = Omit<
-  ConstructorParameters<typeof ModelRef>[0],
-  "id" | "provider" | "route" | "limits" | "generation" | "http" | "auth"
-> & {
-  readonly id: string | ModelID
-  readonly provider: string | ProviderID
-  readonly route: string | RouteID
-  readonly auth?: AuthDef
+export type RouteModelInput = Omit<Model.Input, "provider" | "route">
+
+export type RouteRoutedModelInput = Omit<Model.Input, "route">
+
+export interface RouteDefaults {
+  readonly headers?: Record<string, string>
+  readonly limits?: ModelLimits
+  readonly generation?: GenerationOptions
+  readonly providerOptions?: ProviderOptions
+  readonly http?: HttpOptions
+}
+
+export interface RouteDefaultsInput {
+  readonly headers?: Record<string, string>
  readonly limits?: ModelLimits.Input
  readonly generation?: GenerationOptions.Input
-  readonly http?: HttpOptionsInput
+  readonly providerOptions?: ProviderOptions
+  readonly http?: HttpOptions.Input
 }

-// `baseURL` is required on `ModelRefInput` (every materialized `ModelRef` has
-// a host) but optional at the route-input layers below. The route's `defaults`
-// can supply a canonical URL (e.g. OpenAI/Anthropic) so the user's input may
-// omit it. Routes without a canonical URL (OpenAI-compatible, GitHub Copilot)
-// re-tighten this in their own input type.
-export type RouteModelInput = Omit<ModelRefInput, "provider" | "route" | "baseURL"> & {
-  readonly baseURL?: string
-}
-
-export type RouteModelDefaults = Omit<ModelRefInput, "id" | "route" | "baseURL"> & {
-  readonly baseURL?: string
-}
-
-export type RouteRoutedModelInput = Omit<ModelRefInput, "route" | "baseURL"> & {
-  readonly baseURL?: string
-}
-
-export type RouteRoutedModelDefaults = Partial<Omit<ModelRefInput, "id" | "provider" | "route">>
-
-export type RouteDefaults = Partial<Omit<ModelRefInput, "id" | "provider" | "route">>
-
-export interface RoutePatch<Body, Prepared> extends RouteDefaults {
-  readonly id: string
+export interface RoutePatch<Body, Prepared> extends RouteDefaultsInput {
+  readonly id?: string
  readonly provider?: string | ProviderID
+  readonly auth?: AuthDef
  readonly transport?: Transport<Body, Prepared, unknown>
+  readonly endpoint?: EndpointPatch<Body>
 }

 type RouteMappedModelInput = RouteModelInput | RouteRoutedModelInput

-export interface RouteModelOptions<
-  Input extends RouteMappedModelInput,
-  Output extends RouteMappedModelInput = RouteMappedModelInput,
-> {
-  readonly mapInput?: (input: Input) => Output
+const makeRouteModel = (route: AnyRoute, mapped: RouteMappedModelInput) => {
+  const provider = route.provider ?? ("provider" in mapped ? mapped.provider : undefined)
+  if (!provider) throw new Error(`Route.model(${route.id}) requires a provider`)
+  if (!endpointBaseURL(route.endpoint))
+    throw new Error(`Route.model(${route.id}) requires an endpoint baseURL — configure it on the route first`)
+  return Model.make({
+    ...mapped,
+    provider,
+    route,
+  })
 }

-export interface RouteMappedModelOptions<Input, Output extends RouteMappedModelInput = RouteMappedModelInput> {
-  readonly mapInput: (input: Input) => Output
-}
-
-const modelWithDefaults =
-  <Input>(
-    route: AnyRoute,
-    defaults: Partial<Omit<ModelRefInput, "id" | "route">>,
-    options: { readonly mapInput?: (input: Input) => RouteMappedModelInput },
-  ) =>
-  (input: Input) => {
-    const mapped = options.mapInput === undefined ? (input as RouteMappedModelInput) : options.mapInput(input)
-    const provider = defaults.provider ?? route.provider ?? ("provider" in mapped ? mapped.provider : undefined)
-    if (!provider) throw new Error(`Route.model(${route.id}) requires a provider`)
-    const baseURL = mapped.baseURL ?? defaults.baseURL ?? route.defaults.baseURL
-    if (!baseURL)
-      throw new Error(`Route.model(${route.id}) requires a baseURL — supply it via input, defaults, or route defaults`)
-    const generation = mergeGenerationOptions(route.defaults.generation, defaults.generation)
-    const providerOptions = mergeProviderOptions(route.defaults.providerOptions, defaults.providerOptions)
-    const http = mergeHttpOptions(httpOptions(route.defaults.http), httpOptions(defaults.http))
-    return modelRef({
-      ...route.defaults,
-      ...defaults,
-      ...mapped,
-      baseURL,
-      provider,
-      route: route.id,
-      limits: mapped.limits ?? defaults.limits ?? route.defaults.limits,
-      generation: mergeGenerationOptions(generation, mapped.generation),
-      providerOptions: mergeProviderOptions(providerOptions, mapped.providerOptions),
-      http: mergeHttpOptions(http, httpOptions(mapped.http)),
-    })
+const mergeRouteDefaults = (base: RouteDefaults | undefined, patch: RouteDefaultsInput): RouteDefaults => {
+  const headers = mergeHeaders(base?.headers, patch.headers)
+  return {
+    ...base,
+    ...patch,
+    headers,
+    limits: patch.limits === undefined ? base?.limits : ModelLimits.make(patch.limits),
+    generation: mergeGenerationOptions(generationOptions(base?.generation), generationOptions(patch.generation)),
+    providerOptions: mergeProviderOptions(base?.providerOptions, patch.providerOptions),
+    http: mergeHttpOptions(
+      base?.http,
+      httpOptions(patch.http),
+      headers === undefined ? undefined : new HttpOptions({ headers }),
+    ),
  }
+}

-const mergeRouteDefaults = (base: RouteDefaults | undefined, patch: RouteDefaults): RouteDefaults => ({
-  ...base,
-  ...patch,
-  limits: patch.limits ?? base?.limits,
-  generation: mergeGenerationOptions(generationOptions(base?.generation), generationOptions(patch.generation)),
-  providerOptions: mergeProviderOptions(base?.providerOptions, patch.providerOptions),
-  http: mergeHttpOptions(httpOptions(base?.http), httpOptions(patch.http)),
-})
+const endpointBaseURL = <Body>(endpoint: Endpoint<Body>) =>
+  typeof endpoint.baseURL === "string" ? endpoint.baseURL : undefined

-export const modelLimits = ModelLimits.make
+const mergeHeaders = (...items: ReadonlyArray<Record<string, string> | undefined>) => {
+  const entries = items.flatMap((item) =>
+    item === undefined ? [] : Object.entries(item).filter((entry): entry is [string, string] => entry[1] !== undefined),
+  )
+  if (entries.length === 0) return undefined
+  return Object.fromEntries(entries)
+}

 export const generationOptions = (input: GenerationOptions.Input | undefined) =>
  input === undefined ? undefined : GenerationOptions.make(input)
@@ -180,40 +140,6 @@ export const httpOptions = (input: HttpOptionsInput | undefined) => {
  return HttpOptions.make(input)
 }

-export const modelRef = (input: ModelRefInput) =>
-  new ModelRef({
-    ...input,
-    id: ModelID.make(input.id),
-    provider: ProviderID.make(input.provider),
-    route: RouteID.make(input.route),
-    limits: modelLimits(input.limits),
-    generation: generationOptions(input.generation),
-    http: httpOptions(input.http),
-  })
-
-function model<Input extends RouteModelInput = RouteModelInput>(
-  route: AnyRoute,
-  defaults: RouteModelDefaults,
-  options?: RouteModelOptions<Input, RouteModelInput>,
-): (input: Input) => ModelRef
-function model<Input extends RouteRoutedModelInput = RouteRoutedModelInput>(
-  route: AnyRoute,
-  defaults?: RouteRoutedModelDefaults,
-  options?: RouteModelOptions<Input, RouteRoutedModelInput>,
-): (input: Input) => ModelRef
-function model<Input, Output extends RouteMappedModelInput = RouteMappedModelInput>(
-  route: AnyRoute,
-  defaults: Partial<Omit<ModelRefInput, "id" | "route">>,
-  options: RouteMappedModelOptions<Input, Output>,
-): (input: Input) => ModelRef
-function model<Input>(
-  route: AnyRoute,
-  defaults: Partial<Omit<ModelRefInput, "id" | "route">> = {},
-  options: { readonly mapInput?: (input: Input) => RouteMappedModelInput } = {},
-) {
-  return modelWithDefaults(route, defaults, options)
-}
-
 export interface Interface {
  /**
   * Compile a request through protocol body construction, validation, and HTTP
@@ -242,22 +168,16 @@ export interface GenerateMethod {

 export class Service extends Context.Service<Service, Interface>()("@opencode/LLMClient") {}

-const noRoute = (model: ModelRef) =>
-  new LLMErrorClass({
-    module: "LLMClient",
-    method: "resolveRoute",
-    reason: new NoRouteReason({ route: model.route, provider: model.provider, model: model.id }),
-  })
-
 const resolveRequestOptions = (request: LLMRequest) =>
  LLMRequest.update(request, {
-    generation: mergeGenerationOptions(request.model.generation, request.generation) ?? new GenerationOptions({}),
-    providerOptions: mergeProviderOptions(request.model.providerOptions, request.providerOptions),
-    http: mergeHttpOptions(request.model.http, request.http),
+    generation:
+      mergeGenerationOptions(request.model.route.defaults.generation, request.generation) ?? new GenerationOptions({}),
+    providerOptions: mergeProviderOptions(request.model.route.defaults.providerOptions, request.providerOptions),
+    http: mergeHttpOptions(request.model.route.defaults.http, request.http),
  })

 export interface MakeInput<Body, Frame, Event, State> {
-  /** Route id used in registry lookup and error messages. */
+  /** Route id used in diagnostics and prepared request metadata. */
  readonly id: string
  /** Provider identity for route-owned model construction. */
  readonly provider?: string | ProviderID
@@ -265,27 +185,33 @@ export interface MakeInput<Body, Frame, Event, State> {
  readonly protocol: Protocol<Body, Frame, Event, State>
  /** Where the request is sent. */
  readonly endpoint: Endpoint<Body>
-  /** Per-request transport auth. Model-level `Auth` overrides this. */
+  /** Per-request transport auth. Provider facades override this via `route.with(...)`. */
  readonly auth?: AuthDef
  /** Stream framing — bytes -> frames before `protocol.stream.event` decoding. */
  readonly framing: Framing<Frame>
  /** Static / per-request headers added before `auth` runs. */
  readonly headers?: (input: { readonly request: LLMRequest }) => Record<string, string>
-  /** Model defaults used by the route's `.model(...)` helper. */
-  readonly defaults?: RouteDefaults
+  /** Route/request defaults used when compiling requests for this route. */
+  readonly defaults?: RouteDefaultsInput
 }

 export interface MakeTransportInput<Body, Prepared, Frame, Event, State> {
-  /** Route id used in registry lookup and error messages. */
+  /** Route id used in diagnostics and prepared request metadata. */
  readonly id: string
  /** Provider identity for route-owned model construction. */
  readonly provider?: string | ProviderID
  /** Semantic API contract — owns body construction, body schema, and parsing. */
  readonly protocol: Protocol<Body, Frame, Event, State>
+  /** Where the request is sent. */
+  readonly endpoint: Endpoint<Body>
+  /** Per-request transport auth. Provider facades override this via `route.with(...)`. */
+  readonly auth?: AuthDef
+  /** Static / per-request headers added before `auth` runs. */
+  readonly headers?: (input: { readonly request: LLMRequest }) => Record<string, string>
  /** Runnable transport route. */
  readonly transport: Transport<Body, Prepared, Frame>
-  /** Provider/model defaults used by the route's `.model(...)` helper. */
-  readonly defaults?: RouteDefaults
+  /** Route/request defaults used when compiling requests for this route. */
+  readonly defaults?: RouteDefaultsInput
 }

 const streamError = (route: string, message: string, cause: Cause.Cause<unknown>) => {
@@ -298,6 +224,7 @@ function makeFromTransport<Body, Prepared, Frame, Event, State>(
  input: MakeTransportInput<Body, Prepared, Frame, Event, State>,
 ): Route<Body, Prepared> {
  const protocol = input.protocol
+  const encodeBody = Schema.encodeSync(Schema.fromJsonString(protocol.body.schema))
  const decodeEventEffect = Schema.decodeUnknownEffect(protocol.stream.event)
  const decodeEvent = (route: string) => (frame: Frame) =>
    decodeEventEffect(frame).pipe(
@@ -310,29 +237,44 @@ function makeFromTransport<Body, Prepared, Frame, Event, State>(
      ),
    )

-  const build = (routeInput: MakeTransportInput<Body, Prepared, Frame, Event, State>): Route<Body, Prepared> => {
+  type BuiltRouteInput = Omit<MakeTransportInput<Body, Prepared, Frame, Event, State>, "defaults"> & {
+    readonly defaults?: RouteDefaults
+  }
+
+  const build = (routeInput: BuiltRouteInput): Route<Body, Prepared> => {
    const route: Route<Body, Prepared> = {
      id: routeInput.id,
      provider: routeInput.provider === undefined ? undefined : ProviderID.make(routeInput.provider),
      protocol: protocol.id,
+      endpoint: routeInput.endpoint,
+      auth: routeInput.auth ?? Auth.none,
      transport: routeInput.transport,
      defaults: routeInput.defaults ?? {},
      body: protocol.body,
      with: (patch: RoutePatch<Body, Prepared>) => {
-        const { id, provider, transport, ...defaults } = patch
-        if (!id || id === routeInput.id) throw new Error(`Route.with(${routeInput.id}) requires a new route id`)
+        const { id, provider, auth, transport, endpoint, ...defaults } = patch
        return build({
          ...routeInput,
-          id,
+          id: id ?? routeInput.id,
          provider: provider ?? routeInput.provider,
+          auth: auth ?? routeInput.auth,
+          endpoint: endpoint ? Endpoint.merge(routeInput.endpoint, endpoint) : routeInput.endpoint,
          transport: (transport as Transport<Body, Prepared, Frame> | undefined) ?? routeInput.transport,
-          defaults: mergeRouteDefaults(routeInput.defaults, defaults),
+          defaults: mergeRouteDefaults(route.defaults, defaults),
        })
      },
-      model: (input: RouteModelInput): ModelRef => modelWithDefaults<RouteModelInput>(route, {}, {})(input),
-      prepareTransport: routeInput.transport.prepare,
+      model: (input) => makeRouteModel(route, input),
+      prepareTransport: (body, request) =>
+        routeInput.transport.prepare({
+          body,
+          request,
+          endpoint: routeInput.endpoint,
+          auth: routeInput.auth ?? Auth.none,
+          encodeBody,
+          headers: routeInput.headers,
+        }),
      streamPrepared: (prepared: Prepared, request: LLMRequest, runtime: TransportRuntime) => {
-        const route = `${request.model.provider}/${request.model.route}`
+        const route = `${request.model.provider}/${request.model.route.id}`
        const events = routeInput.transport
          .frames(prepared, request, runtime)
          .pipe(
@@ -349,10 +291,10 @@ function makeFromTransport<Body, Prepared, Frame, Event, State>(
        )
      },
    } satisfies Route<Body, Prepared>
-    return register(route)
+    return route
  }

-  return build(input)
+  return build({ ...input, defaults: mergeRouteDefaults(undefined, input.defaults ?? {}) })
 }

 export function make<Body, Prepared, Frame, Event, State>(
@@ -381,18 +323,14 @@ export function make<Body, Prepared, Frame, Event, State>(
 ): Route<Body, Prepared> | Route<Body, HttpTransport.HttpPrepared<Frame>> {
  if ("transport" in input) return makeFromTransport(input)
  const protocol = input.protocol
-  const encodeBody = Schema.encodeSync(Schema.fromJsonString(protocol.body.schema))
  return makeFromTransport({
    id: input.id,
    provider: input.provider,
    protocol,
-    transport: HttpTransport.httpJson({
-      endpoint: input.endpoint,
-      auth: input.auth,
-      framing: input.framing,
-      encodeBody,
-      headers: input.headers,
-    }),
+    endpoint: input.endpoint,
+    auth: input.auth,
+    headers: input.headers,
+    transport: HttpTransport.httpJson({ framing: input.framing }),
    defaults: input.defaults,
  })
 }
@@ -402,8 +340,7 @@ export function make<Body, Prepared, Frame, Event, State>(
 // execute transport.
 const compile = Effect.fn("LLM.compile")(function* (request: LLMRequest) {
  const resolved = applyCachePolicy(resolveRequestOptions(request))
-  const route = registeredRoute(resolved.model.route)
-  if (!route) return yield* noRoute(resolved.model)
+  const route = resolved.model.route

  const body = yield* route.body
    .from(resolved)
@@ -495,31 +432,21 @@ export const streamRequest = (request: LLMRequest) =>
 export const layer: Layer.Layer<Service, never, RequestExecutor.Service> = Layer.effect(
  Service,
  Effect.gen(function* () {
-    const stream = streamWith(streamRequestWith({ http: yield* RequestExecutor.Service }))
+    const stream = streamWith(
+      streamRequestWith({
+        http: yield* RequestExecutor.Service,
+        webSocket: Option.getOrUndefined(yield* Effect.serviceOption(WebSocketExecutor.Service)),
+      }),
+    )
    return Service.of({ prepare: prepareWith as Interface["prepare"], stream, generate: generateWith(stream) })
  }),
 )

-export const layerWithWebSocket: Layer.Layer<Service, never, RequestExecutor.Service | WebSocketExecutorService> =
-  Layer.effect(
-    Service,
-    Effect.gen(function* () {
-      const stream = streamWith(
-        streamRequestWith({
-          http: yield* RequestExecutor.Service,
-          webSocket: yield* WebSocketExecutor.Service,
-        }),
-      )
-      return Service.of({ prepare: prepareWith as Interface["prepare"], stream, generate: generateWith(stream) })
-    }),
-  )
-
-export const Route = { make, model } as const
+export const Route = { make } as const

 export const LLMClient = {
  Service,
  layer,
-  layerWithWebSocket,
  prepare,
  stream,
  generate,
--- a/packages/llm/src/route/endpoint.ts
+++ b/packages/llm/src/route/endpoint.ts
@@ -11,28 +11,42 @@ export type EndpointPart<Body> = string | ((input: EndpointInput<Body>) => strin
 /**
 * Declarative URL construction for one route.
 *
- * `Endpoint` carries only the path. The host always lives on `model.baseURL`,
- * supplied by the provider helper that constructs the model. `render(...)`
- * just appends the path (and any `model.queryParams`) to that host.
+ * `Endpoint` holds an optional `baseURL`, the `path`, and static `query`
+ * params. Routes with a canonical host set `baseURL` here; provider helpers
+ * can override it by configuring the route before selecting a model.
 *
 * `path` may be a string or a function of `EndpointInput`, for routes whose
 * URL embeds the model id, region, or another body field (e.g. Bedrock,
 * Gemini).
 */
 export interface Endpoint<Body> {
+  readonly baseURL?: string
  readonly path: EndpointPart<Body>
+  readonly query?: Record<string, string>
 }

+export type EndpointPatch<Body> = Partial<Endpoint<Body>>
+
 /** Construct an `Endpoint` from a path string or path function. */
-export const path = <Body>(value: EndpointPart<Body>): Endpoint<Body> => ({ path: value })
+export const path = <Body>(value: EndpointPart<Body>, options: Omit<Endpoint<Body>, "path"> = {}): Endpoint<Body> => ({
+  ...options,
+  path: value,
+})
+
+export const merge = <Body>(base: Endpoint<Body>, patch: EndpointPatch<Body>): Endpoint<Body> => ({
+  ...base,
+  ...patch,
+  baseURL: patch.baseURL ?? base.baseURL,
+  path: patch.path ?? base.path,
+  query: patch.query === undefined ? base.query : { ...base.query, ...patch.query },
+})

 const renderPart = <Body>(part: EndpointPart<Body>, input: EndpointInput<Body>) =>
  typeof part === "function" ? part(input) : part

 export const render = <Body>(endpoint: Endpoint<Body>, input: EndpointInput<Body>) => {
-  const url = new URL(`${ProviderShared.trimBaseUrl(input.request.model.baseURL)}${renderPart(endpoint.path, input)}`)
-  const params = input.request.model.queryParams
-  if (params) for (const [key, value] of Object.entries(params)) url.searchParams.set(key, value)
+  const url = new URL(`${ProviderShared.trimBaseUrl(endpoint.baseURL ?? "")}${renderPart(endpoint.path, input)}`)
+  for (const [key, value] of Object.entries(endpoint.query ?? {})) url.searchParams.set(key, value)
  return url
 }

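Reviewer sketch (not part of the commit): the new `merge` and `render` behavior in `endpoint.ts` can be tried in isolation roughly as below. The types are simplified stand-ins for the ones in the diff, `trimBaseUrl` is assumed to strip trailing slashes (its body is not shown in this diff), and the host names are placeholders.

```typescript
// Simplified stand-ins for the diff's types; the real EndpointInput also carries the request.
interface EndpointInput<Body> {
  readonly body: Body
}
type EndpointPart<Body> = string | ((input: EndpointInput<Body>) => string)

interface Endpoint<Body> {
  readonly baseURL?: string
  readonly path: EndpointPart<Body>
  readonly query?: Record<string, string>
}

// Mirrors `merge` from the diff: the patch wins per field, query params merge key by key.
const merge = <Body>(base: Endpoint<Body>, patch: Partial<Endpoint<Body>>): Endpoint<Body> => ({
  ...base,
  ...patch,
  baseURL: patch.baseURL ?? base.baseURL,
  path: patch.path ?? base.path,
  query: patch.query === undefined ? base.query : { ...base.query, ...patch.query },
})

// Assumption: ProviderShared.trimBaseUrl removes trailing slashes.
const trimBaseUrl = (value: string) => value.replace(/\/+$/, "")

// Mirrors `render` from the diff: host and query now come from the endpoint, not the model.
const render = <Body>(endpoint: Endpoint<Body>, input: EndpointInput<Body>): URL => {
  const part = typeof endpoint.path === "function" ? endpoint.path(input) : endpoint.path
  const url = new URL(`${trimBaseUrl(endpoint.baseURL ?? "")}${part}`)
  for (const [key, value] of Object.entries(endpoint.query ?? {})) url.searchParams.set(key, value)
  return url
}

// Hypothetical route endpoint whose path embeds the model id.
const base: Endpoint<{ model: string }> = {
  baseURL: "https://api.example.test/",
  path: ({ body }) => `/v1/models/${body.model}:generate`,
  query: { "api-version": "1" },
}

// A provider helper overriding the host and adding one query param.
const patched = merge(base, { baseURL: "https://proxy.example.test", query: { region: "eu" } })
const rendered = render(patched, { body: { model: "demo" } }).toString()
console.log(rendered)
```

One consequence worth checking in review: with `baseURL` now optional, an endpoint that never receives one reaches `new URL` with a bare path, and `new URL` throws on relative input. Routes without a canonical host therefore seem to depend on the provider helper always supplying `baseURL` through `route.with(...)`.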
--- a/packages/llm/src/route/index.ts
+++ b/packages/llm/src/route/index.ts
@@ -1,14 +1,13 @@
-export { Route, LLMClient, modelLimits, modelRef } from "./client"
+export { Route, LLMClient } from "./client"
 export type {
  Route as RouteShape,
-  RouteModelDefaults,
  RouteModelInput,
-  RouteRoutedModelDefaults,
  RouteRoutedModelInput,
+  RouteDefaults,
+  RouteDefaultsInput,
  AnyRoute,
  Interface as LLMClientShape,
  Service as LLMClientService,
-  ModelRefInput,
 } from "./client"
 export * from "./executor"
 export { Auth } from "./auth"
--- a/packages/llm/src/route/transport/http.ts
+++ b/packages/llm/src/route/transport/http.ts
@@ -1,20 +1,13 @@
 import { Effect, Stream } from "effect"
 import { Headers, HttpClientRequest } from "effect/unstable/http"
-import { Auth, type Auth as AuthDef } from "../auth"
-import { type Endpoint, render as renderEndpoint } from "../endpoint"
-import type { Framing } from "../framing"
-import type { Transport } from "./index"
+import { Auth } from "../auth"
+import { render as renderEndpoint } from "../endpoint"
+import { Framing, type Framing as FramingDef } from "../framing"
+import type { Transport, TransportPrepareInput } from "./index"
 import * as ProviderShared from "../../protocols/shared"
 import { mergeJsonRecords, type LLMRequest } from "../../schema"

-export interface JsonRequestInput<Body> {
-  readonly body: Body
-  readonly request: LLMRequest
-  readonly endpoint: Endpoint<Body>
-  readonly auth: AuthDef
-  readonly encodeBody: (body: Body) => string
-  readonly headers?: (input: { readonly request: LLMRequest }) => Record<string, string>
-}
+export type JsonRequestInput<Body> = TransportPrepareInput<Body>

 export interface JsonRequestParts<Body = unknown> {
  readonly url: string
@@ -25,7 +18,7 @@ export interface JsonRequestParts<Body = unknown> {

 export interface HttpPrepared<Frame> {
  readonly request: HttpClientRequest.HttpClientRequest
-  readonly framing: Framing<Frame>
+  readonly framing: FramingDef<Frame>
 }

 const applyQuery = (url: string, query: Record<string, string> | undefined) => {
@@ -52,28 +45,21 @@ export const jsonRequestParts = <Body>(input: JsonRequestInput<Body>) =>
      input.request.http?.query,
    )
    const body = yield* bodyWithOverlay(input.body, input.request, input.encodeBody)
-    const headers = yield* Auth.toEffect(Auth.isAuth(input.request.model.auth) ? input.request.model.auth : input.auth)(
-      {
-        request: input.request,
-        method: "POST",
-        url,
-        body: body.bodyText,
-        headers: Headers.fromInput({
-          ...(input.headers?.({ request: input.request }) ?? {}),
-          ...input.request.model.headers,
-          ...input.request.http?.headers,
-        }),
-      },
-    )
+    const headers = yield* Auth.toEffect(input.auth)({
+      request: input.request,
+      method: "POST",
+      url,
+      body: body.bodyText,
+      headers: Headers.fromInput({
+        ...input.headers?.({ request: input.request }),
+        ...input.request.http?.headers,
+      }),
+    })
    return { url, jsonBody: body.jsonBody, bodyText: body.bodyText, headers }
  })

-export interface HttpJsonInput<Body, Frame> {
-  readonly endpoint: Endpoint<Body>
-  readonly auth?: AuthDef
-  readonly framing: Framing<Frame>
-  readonly encodeBody: (body: Body) => string
-  readonly headers?: (input: { readonly request: LLMRequest }) => Record<string, string>
+export interface HttpJsonInput<_Body, Frame> {
+  readonly framing: FramingDef<Frame>
 }

 export type HttpJsonPatch<Body, Frame> = Partial<HttpJsonInput<Body, Frame>>
@@ -85,14 +71,9 @@ export interface HttpJsonTransport<Body, Frame> extends Transport<Body, HttpPrep
 export const httpJson = <Body, Frame>(input: HttpJsonInput<Body, Frame>): HttpJsonTransport<Body, Frame> => ({
  id: "http-json",
  with: (patch) => httpJson({ ...input, ...patch }),
-  prepare: (body, request) =>
+  prepare: (prepareInput) =>
    jsonRequestParts({
-      body,
-      request,
-      endpoint: input.endpoint,
-      auth: input.auth ?? Auth.bearer(),
-      encodeBody: input.encodeBody,
-      headers: input.headers,
+      ...prepareInput,
    }).pipe(
      Effect.map((parts) => ({
        request: ProviderShared.jsonPost({ url: parts.url, body: parts.bodyText, headers: parts.headers }),
@@ -109,8 +90,8 @@ export const httpJson = <Body, Frame>(input: HttpJsonInput<Body, Frame>): HttpJs
              response.stream.pipe(
                Stream.mapError((error) =>
                  ProviderShared.eventError(
-                    `${request.model.provider}/${request.model.route}`,
-                    `Failed to read ${request.model.provider}/${request.model.route} stream`,
+                    `${request.model.provider}/${request.model.route.id}`,
+                    `Failed to read ${request.model.provider}/${request.model.route.id} stream`,
                    ProviderShared.errorText(error),
                  ),
                ),
@@ -120,3 +101,8 @@ export const httpJson = <Body, Frame>(input: HttpJsonInput<Body, Frame>): HttpJs
        ),
    ),
 })
+
+export const sseJson = {
+  id: "http-json/sse",
+  with: <Body>() => httpJson<Body, string>({ framing: Framing.sse }),
+} as const
--- a/packages/llm/src/route/transport/index.ts
+++ b/packages/llm/src/route/transport/index.ts
@@ -1,4 +1,6 @@
 import type { Effect, Stream } from "effect"
+import type { Endpoint } from "../endpoint"
+import type { Auth } from "../auth"
 import type { Interface as RequestExecutorInterface } from "../executor"
 import type { Interface as WebSocketExecutorInterface } from "./websocket"
 import type { LLMError, LLMRequest } from "../../schema"
@@ -10,7 +12,7 @@ export interface TransportRuntime {

 export interface Transport<Body, Prepared, Frame> {
  readonly id: string
-  readonly prepare: (body: Body, request: LLMRequest) => Effect.Effect<Prepared, LLMError>
+  readonly prepare: (input: TransportPrepareInput<Body>) => Effect.Effect<Prepared, LLMError>
  readonly frames: (
    prepared: Prepared,
    request: LLMRequest,
@@ -18,5 +20,14 @@ export interface Transport<Body, Prepared, Frame> {
  ) => Stream.Stream<Frame, LLMError>
 }

+export interface TransportPrepareInput<Body> {
+  readonly body: Body
+  readonly request: LLMRequest
+  readonly endpoint: Endpoint<Body>
+  readonly auth: Auth
+  readonly encodeBody: (body: Body) => string
+  readonly headers?: (input: { readonly request: LLMRequest }) => Record<string, string>
+}
+
 export * as HttpTransport from "./http"
 export { WebSocketExecutor, WebSocketTransport } from "./websocket"
--- a/packages/llm/src/route/transport/websocket.ts
+++ b/packages/llm/src/route/transport/websocket.ts
@@ -1,7 +1,6 @@
-import { Cause, Context, Effect, Queue, Stream } from "effect"
+import { Cause, Context, Effect, Layer, Queue, Stream } from "effect"
 import { Headers } from "effect/unstable/http"
-import { Auth, type Auth as AuthDef } from "../auth"
-import type { Endpoint } from "../endpoint"
+import { Auth } from "../auth"
 import { LLMError, TransportReason, type LLMRequest } from "../../schema"
 import * as HttpTransport from "./http"
 import type { Transport } from "./index"
@@ -135,6 +134,8 @@ export const open = (input: WebSocketRequest) =>
      }),
  }).pipe(Effect.flatMap((ws) => fromWebSocket(ws, input)))

+export const layer: Layer.Layer<Service> = Layer.succeed(Service, Service.of({ open }))
+
 export const fromWebSocket = (
  ws: globalThis.WebSocket,
  input: WebSocketRequest,
@@ -213,12 +214,8 @@ export interface JsonPrepared {
 }

 export interface JsonInput<Body, Message> {
-  readonly endpoint: Endpoint<Body>
-  readonly auth?: AuthDef
-  readonly encodeBody: (body: Body) => string
  readonly toMessage: (body: Body | Record<string, unknown>) => Effect.Effect<Message, LLMError>
  readonly encodeMessage: (message: Message) => string
-  readonly headers?: (input: { readonly request: LLMRequest }) => Record<string, string>
 }

 export type JsonPatch<Body, Message> = Partial<JsonInput<Body, Message>>
@@ -230,15 +227,10 @@ export interface JsonTransport<Body, Message> extends Transport<Body, JsonPrepar
 export const json = <Body, Message>(input: JsonInput<Body, Message>): JsonTransport<Body, Message> => ({
  id: "websocket-json",
  with: (patch) => json({ ...input, ...patch }),
-  prepare: (body, request) =>
+  prepare: (prepareInput) =>
    Effect.gen(function* () {
      const parts = yield* HttpTransport.jsonRequestParts({
-        body,
-        request,
-        endpoint: input.endpoint,
-        auth: input.auth ?? Auth.bearer(),
-        encodeBody: input.encodeBody,
-        headers: input.headers,
+        ...prepareInput,
      })
      return {
        url: yield* webSocketUrl(parts.url),
@@ -270,8 +262,14 @@ export const json = <Body, Message>(input: JsonInput<Body, Message>): JsonTransp
  },
 })

+export const jsonTransport = {
+  id: "websocket-json",
+  with: json,
+} as const
+
 export const WebSocketExecutor = {
  Service,
+  layer,
  open,
  fromWebSocket,
  messageText,
@@ -279,4 +277,5 @@ export const WebSocketExecutor = {

 export const WebSocketTransport = {
  json,
+  jsonTransport,
 } as const
--- a/packages/llm/src/schema/events.ts
+++ b/packages/llm/src/schema/events.ts
@@ -1,6 +1,6 @@
 import { Schema } from "effect"
 import { ContentBlockID, FinishReason, ProtocolID, ProviderMetadata, RouteID, ToolCallID } from "./ids"
-import { ModelRef } from "./options"
+import { ModelSchema } from "./options"
 import { ToolResultValue } from "./messages"

 /**
@@ -290,7 +290,7 @@ export class PreparedRequest extends Schema.Class<PreparedRequest>("LLM.Prepared
  id: Schema.String,
  route: RouteID,
  protocol: ProtocolID,
-  model: ModelRef,
+  model: ModelSchema,
  body: Schema.Unknown,
  metadata: Schema.optional(Schema.Record(Schema.String, Schema.Unknown)),
 }) {}
--- a/packages/llm/src/schema/messages.ts
+++ b/packages/llm/src/schema/messages.ts
@@ -1,9 +1,7 @@
 import { Schema } from "effect"
 import { JsonSchema, MessageRole, ProviderMetadata } from "./ids"
-import { CacheHint, CachePolicy, GenerationOptions, HttpOptions, ModelRef, ProviderOptions } from "./options"
-
-const isRecord = (value: unknown): value is Record<string, unknown> =>
-  typeof value === "object" && value !== null && !Array.isArray(value)
+import { CacheHint, CachePolicy, GenerationOptions, HttpOptions, ModelSchema, ProviderOptions } from "./options"
+import { isRecord } from "../utils/record"

 const systemPartSchema = Schema.Struct({
  type: Schema.Literal("text"),
@@ -41,17 +39,49 @@ export const MediaPart = Schema.Struct({
 }).annotate({ identifier: "LLM.Content.Media" })
 export type MediaPart = Schema.Schema.Type<typeof MediaPart>

+export const ToolResultMediaPart = Schema.Struct({
+  type: Schema.Literal("media"),
+  mediaType: Schema.String,
+  data: Schema.String,
+  filename: Schema.optional(Schema.String),
+  metadata: Schema.optional(Schema.Record(Schema.String, Schema.Unknown)),
+}).annotate({ identifier: "LLM.ToolResult.Media" })
+export type ToolResultMediaPart = Schema.Schema.Type<typeof ToolResultMediaPart>
+
+export const ToolResultContentPart = Schema.Union([TextPart, ToolResultMediaPart])
+export type ToolResultContentPart = Schema.Schema.Type<typeof ToolResultContentPart>
+
 const isToolResultValue = (value: unknown): value is ToolResultValue =>
-  isRecord(value) && (value.type === "text" || value.type === "json" || value.type === "error") && "value" in value
+  isRecord(value) &&
+  (value.type === "text" || value.type === "json" || value.type === "error" || value.type === "content") &&
+  "value" in value

 export const ToolResultValue = Object.assign(
-  Schema.Struct({
-    type: Schema.Literals(["json", "text", "error"]),
-    value: Schema.Unknown,
-  }).annotate({ identifier: "LLM.ToolResult" }),
+  Schema.Union([
+    Schema.Struct({
+      type: Schema.Literal("json"),
+      value: Schema.Unknown,
+    }),
+    Schema.Struct({
+      type: Schema.Literal("text"),
+      value: Schema.Unknown,
+    }),
+    Schema.Struct({
+      type: Schema.Literal("error"),
+      value: Schema.Unknown,
+    }),
+    Schema.Struct({
+      type: Schema.Literal("content"),
+      value: Schema.Array(ToolResultContentPart),
+    }),
+  ]).annotate({ identifier: "LLM.ToolResult" }),
  {
-    make: (value: unknown, type: ToolResultValue["type"] = "json"): ToolResultValue =>
-      isToolResultValue(value) ? value : { type, value },
+    is: isToolResultValue,
+    make: (value: unknown, type: ToolResultValue["type"] = "json"): ToolResultValue => {
+      if (isToolResultValue(value)) return value
+      if (type === "content") return { type, value: Array.isArray(value) ? value : [] }
+      return { type, value }
+    },
  },
 )
 export type ToolResultValue = Schema.Schema.Type<typeof ToolResultValue>
@@ -197,7 +227,7 @@ export type ResponseFormat = Schema.Schema.Type<typeof ResponseFormat>

 export class LLMRequest extends Schema.Class<LLMRequest>("LLM.Request")({
  id: Schema.optional(Schema.String),
-  model: ModelRef,
+  model: ModelSchema,
  system: Schema.Array(SystemPart),
  messages: Schema.Array(Message),
  tools: Schema.Array(ToolDefinition),
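Reviewer sketch (not part of the commit): the normalization rules added to `ToolResultValue.make` can be exercised without Effect Schema as below. The plain TypeScript types stand in for the Schema definitions, the `TextPart` shape is an assumption (it is not shown in this diff), and the function names are local to this sketch.

```typescript
// Assumed shape; the real TextPart is defined elsewhere in messages.ts.
type TextPart = { readonly type: "text"; readonly text: string }
type ToolResultMediaPart = {
  readonly type: "media"
  readonly mediaType: string
  readonly data: string
  readonly filename?: string
}
type ToolResultContentPart = TextPart | ToolResultMediaPart

type ToolResultValue =
  | { readonly type: "json" | "text" | "error"; readonly value: unknown }
  | { readonly type: "content"; readonly value: ReadonlyArray<ToolResultContentPart> }

const isRecord = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null && !Array.isArray(value)

// Same check as the diff: a known `type` tag plus a `value` key. The payload itself is not inspected.
const isToolResultValue = (value: unknown): value is ToolResultValue =>
  isRecord(value) &&
  (value.type === "text" || value.type === "json" || value.type === "error" || value.type === "content") &&
  "value" in value

// Same branches as the diff's `make`: pass through, coerce "content" to an array, else wrap.
const makeToolResultValue = (value: unknown, type: ToolResultValue["type"] = "json"): ToolResultValue => {
  if (isToolResultValue(value)) return value
  if (type === "content") return { type, value: Array.isArray(value) ? value : [] }
  return { type, value }
}

const wrapped = makeToolResultValue({ rows: 3 })
const emptyContent = makeToolResultValue("not an array", "content")
const alreadyTagged = { type: "text", value: "hi" }
const passedThrough = makeToolResultValue(alreadyTagged, "error")
console.log(wrapped, emptyContent, passedThrough)
```

Two behaviors follow from the branches as written: a non-array passed with `type: "content"` is silently replaced by an empty array, and an already-tagged value ignores the `type` argument. Because the guard only looks at the tag and the presence of `value`, an object such as `{ type: "content", value: "oops" }` also passes through unchanged, so callers that need a strict check would presumably decode with the Schema union instead.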
--- a/packages/llm/src/schema/options.ts
+++ b/packages/llm/src/schema/options.ts
@@ -1,8 +1,7 @@
 import { Schema } from "effect"
-import { JsonSchema, ModelID, ProviderID, RouteID } from "./ids"
-
-const isRecord = (value: unknown): value is Record<string, unknown> =>
-  typeof value === "object" && value !== null && !Array.isArray(value)
+import { JsonSchema, ModelID, ProviderID } from "./ids"
+import type { AnyRoute } from "../route/client"
+import { isRecord } from "../utils/record"

 export const mergeJsonRecords = (
  ...items: ReadonlyArray<Record<string, unknown> | undefined>
@@ -135,67 +134,59 @@ export namespace ModelLimits {
    input instanceof ModelLimits ? input : new ModelLimits(input ?? {})
 }

-export class ModelRef extends Schema.Class<ModelRef>("LLM.ModelRef")({
-  id: ModelID,
-  provider: ProviderID,
-  route: RouteID,
-  baseURL: Schema.String,
-  /** Provider-specific API key convenience. Provider helpers normalize this into `auth`. */
-  apiKey: Schema.optional(Schema.String),
-  /** Optional transport auth policy. Opaque because it may contain functions. */
-  auth: Schema.optional(Schema.Any),
-  headers: Schema.optional(Schema.Record(Schema.String, Schema.String)),
-  /**
-   * Query params appended to the request URL by `Endpoint.baseURL`. Used for
-   * deployment-level URL-scoped settings such as Azure's `api-version` or any
-   * provider that requires a per-request key in the URL. Generic concern, so
-   * lives as a typed first-class field instead of `native`.
-   */
-  queryParams: Schema.optional(Schema.Record(Schema.String, Schema.String)),
-  limits: ModelLimits,
-  /** Provider-neutral generation defaults. Request-level values override them. */
-  generation: Schema.optional(GenerationOptions),
-  /** Provider-owned typed-at-the-facade options for non-portable knobs. */
-  providerOptions: Schema.optional(ProviderOptions),
-  /** Serializable raw HTTP overlays applied to the final outgoing request. */
-  http: Schema.optional(HttpOptions),
-  /**
-   * Provider-specific opaque options. Reach for this only when the value is
-   * genuinely provider-private and does not fit a typed axis (e.g. Bedrock's
-   * `aws_credentials` / `aws_region` for SigV4). Anything used by more than
-   * one route should grow into a typed field instead.
-   */
-  native: Schema.optional(Schema.Record(Schema.String, Schema.Unknown)),
-}) {}
+export class Model {
+  readonly id: ModelID
+  readonly provider: ProviderID
+  readonly route: AnyRoute

-export namespace ModelRef {
-  export type Input = ConstructorParameters<typeof ModelRef>[0]
+  constructor(input: Model.ConstructorInput) {
+    this.id = input.id
+    this.provider = input.provider
+    this.route = input.route
+  }

-  export const input = (model: ModelRef): Input => ({
-    id: model.id,
-    provider: model.provider,
-    route: model.route,
-    baseURL: model.baseURL,
-    apiKey: model.apiKey,
-    auth: model.auth,
-    headers: model.headers,
-    queryParams: model.queryParams,
-    limits: model.limits,
-    generation: model.generation,
-    providerOptions: model.providerOptions,
-    http: model.http,
-    native: model.native,
-  })
+  static make(input: Model.Input) {
+    return new Model({
+      id: ModelID.make(input.id),
+      provider: ProviderID.make(input.provider),
+      route: input.route,
+    })
+  }

-  export const update = (model: ModelRef, patch: Partial<Input>) => {
+  static input(model: Model): Model.ConstructorInput {
+    return {
+      id: model.id,
+      provider: model.provider,
+      route: model.route,
+    }
+  }
+
+  static update(model: Model, patch: Partial<Model.Input>) {
    if (Object.keys(patch).length === 0) return model
-    return new ModelRef({
-      ...input(model),
+    return Model.make({
+      ...Model.input(model),
      ...patch,
    })
  }
 }

+export namespace Model {
+  export type ConstructorInput = {
+    readonly id: ModelID
+    readonly provider: ProviderID
+    readonly route: AnyRoute
+  }
+
+  export type Input = Omit<ConstructorInput, "id" | "provider"> & {
+    readonly id: string | ModelID
+    readonly provider: string | ProviderID
+  }
+}
+
+export type ModelInput = Model.Input
+
+export const ModelSchema = Schema.declare((value): value is Model => value instanceof Model, { expected: "LLM.Model" })
+
 export class CacheHint extends Schema.Class<CacheHint>("LLM.CacheHint")({
  type: Schema.Literals(["ephemeral", "persistent"]),
  ttlSeconds: Schema.optional(Schema.Number),
--- a/packages/llm/src/tool-runtime.ts
+++ b/packages/llm/src/tool-runtime.ts
@@ -11,7 +11,8 @@ import {
  ToolCallPart,
  ToolFailure,
  ToolResultPart,
-  type ToolResultValue,
+  ToolResultValue,
+  type ToolResultValue as ToolResultValueType,
  Usage,
 } from "./schema"
 import { type AnyTool, type ExecutableTools, type Tools, toDefinitions } from "./tool"
@@ -276,7 +277,10 @@ const appendStreamingText = (
  state.assistantContent.push({ type, text, providerMetadata })
 }

-const dispatch = (tools: Tools, call: ToolCallPart): Effect.Effect<{ result: ToolResultValue; error?: unknown }> => {
+const dispatch = (
+  tools: Tools,
+  call: ToolCallPart,
+): Effect.Effect<{ result: ToolResultValueType; error?: unknown }> => {
  const tool = tools[call.name]
  if (!tool) return Effect.succeed({ result: { type: "error" as const, value: `Unknown tool: ${call.name}` } })
  if (!tool.execute)
@@ -285,7 +289,7 @@ const dispatch = (tools: Tools, call: ToolCallPart): Effect.Effect<{ result: Too
  return decodeAndExecute(tool, call).pipe(
    Effect.catchTag("LLM.ToolFailure", (failure) =>
      Effect.succeed({
-        result: { type: "error" as const, value: failure.message } satisfies ToolResultValue,
+        result: { type: "error" as const, value: failure.message } satisfies ToolResultValueType,
        error: failure.error,
      }),
    ),
@@ -293,7 +297,7 @@ const dispatch = (tools: Tools, call: ToolCallPart): Effect.Effect<{ result: Too
  )
 }

-const decodeAndExecute = (tool: AnyTool, call: ToolCallPart): Effect.Effect<ToolResultValue, ToolFailure> =>
+const decodeAndExecute = (tool: AnyTool, call: ToolCallPart): Effect.Effect<ToolResultValueType, ToolFailure> =>
  tool._decode(call.input).pipe(
    Effect.mapError((error) => new ToolFailure({ message: `Invalid tool input: ${error.message}` })),
    Effect.flatMap((decoded) => tool.execute!(decoded, { id: call.id, name: call.name })),
@@ -307,10 +311,12 @@ const decodeAndExecute = (tool: AnyTool, call: ToolCallPart): Effect.Effect<Tool
        ),
      ),
    ),
-    Effect.map((encoded): ToolResultValue => ({ type: "json", value: encoded })),
+    Effect.map(
+      (encoded): ToolResultValueType => (ToolResultValue.is(encoded) ? encoded : { type: "json", value: encoded }),
+    ),
  )

-const emitEvents = (call: ToolCallPart, result: ToolResultValue, error: unknown): ReadonlyArray<LLMEvent> =>
+const emitEvents = (call: ToolCallPart, result: ToolResultValueType, error: unknown): ReadonlyArray<LLMEvent> =>
  result.type === "error"
    ? [
        LLMEvent.toolError({ id: call.id, name: call.name, message: String(result.value), error }),
@@ -321,7 +327,7 @@ const emitEvents = (call: ToolCallPart, result: ToolResultValue, error: unknown)
 const followUpRequest = (
  request: LLMRequest,
  state: StepState,
-  dispatched: ReadonlyArray<readonly [ToolCallPart, ToolResultValue]>,
+  dispatched: ReadonlyArray<readonly [ToolCallPart, ToolResultValueType]>,
 ) =>
  LLMRequest.update(request, {
    messages: [
--- a/packages/llm/src/utils/record.ts
+++ b/packages/llm/src/utils/record.ts
@@ -0,0 +1,3 @@
+/** Plain-record narrowing. Excludes arrays so JSON object checks don't accept tuples as key/value bags. */
+export const isRecord = (value: unknown): value is Record<string, unknown> =>
+  typeof value === "object" && value !== null && !Array.isArray(value)
--- a/packages/llm/test/adapter.test.ts
+++ b/packages/llm/test/adapter.test.ts
@@ -1,12 +1,12 @@
 import { describe, expect } from "bun:test"
 import { Effect, Schema, Stream } from "effect"
 import { LLM } from "../src"
-import { Route, Endpoint, LLMClient, Protocol, type RouteModelInput, type FramingDef } from "../src/route"
-import { ModelRef } from "../src/schema"
+import { Route, Endpoint, LLMClient, Protocol, type FramingDef } from "../src/route"
+import { Model } from "../src/schema"
 import { testEffect } from "./lib/effect"
 import { dynamicResponse } from "./lib/http"

-const updateModel = (model: ModelRef, patch: Partial<ModelRef.Input>) => ModelRef.update(model, patch)
+const updateModel = (model: Model, patch: Partial<Model.Input>) => Model.update(model, patch)

 const Json = Schema.fromJsonString(Schema.Unknown)
 const encodeJson = Schema.encodeSync(Json)
@@ -38,17 +38,6 @@ const fakeFraming: FramingDef<FakeEvent> = {
    ).pipe(Stream.flatMap(Stream.fromIterable)),
 }

-const request = LLM.request({
-  id: "req_1",
-  model: LLM.model({
-    id: "fake-model",
-    provider: "fake-provider",
-    route: "fake",
-    baseURL: "https://fake.local",
-  }),
-  prompt: "hello",
-})
-
 const raiseEvent = (event: FakeEvent): import("../src/schema").LLMEvent =>
  event.type === "finish"
    ? { type: "finish", reason: event.reason }
@@ -84,6 +73,7 @@ const fake = Route.make({
  endpoint: Endpoint.path("/chat"),
  framing: fakeFraming,
 })
+const configuredFake = fake.with({ endpoint: { baseURL: "https://fake.local" } })

 const gemini = Route.make({
  id: "gemini-fake",
@@ -91,6 +81,17 @@ const gemini = Route.make({
  endpoint: Endpoint.path("/chat"),
  framing: fakeFraming,
 })
+const configuredGemini = gemini.with({ endpoint: { baseURL: "https://fake.local" } })
+
+const request = LLM.request({
+  id: "req_1",
+  model: Model.make({
+    id: "fake-model",
+    provider: "fake-provider",
+    route: configuredFake,
+  }),
+  prompt: "hello",
+})

 const echoLayer = dynamicResponse(({ text, respond }) =>
  Effect.succeed(
@@ -117,61 +118,47 @@ describe("llm route", () => {
    }),
  )

-  it.effect("selects routes by request route", () =>
+  it.effect("selects routes by model route value", () =>
    Effect.gen(function* () {
      const llm = yield* LLMClient.Service
      const prepared = yield* llm.prepare(
-        LLM.updateRequest(request, { model: updateModel(request.model, { route: "gemini-fake" }) }),
+        LLM.updateRequest(request, { model: updateModel(request.model, { route: configuredGemini }) }),
      )

      expect(prepared.route).toBe("gemini-fake")
    }),
  )

-  it.effect("maps model input before building refs", () =>
+  it.effect("builds models from configured routes", () =>
    Effect.gen(function* () {
-      const mapped = Route.model<RouteModelInput & { readonly region?: string }>(
-        fake,
-        { provider: "fake-provider", baseURL: "https://fake.local" },
-        {
-          mapInput: (input) => {
-            const { region, ...rest } = input
-            return { ...rest, native: { region } }
+      const configured = fake.with({ provider: "fake-provider", endpoint: { baseURL: "https://fake.local" } })
+
+      expect(configured.model({ id: "fake-model" })).toMatchObject({
+        provider: "fake-provider",
+      })
+    }),
+  )
+
+  it.effect("does not register duplicate route ids globally", () =>
+    Effect.gen(function* () {
+      const duplicate = Route.make({
+        id: "fake",
+        protocol: Protocol.make({
+          ...fakeProtocol,
+          body: {
+            ...fakeProtocol.body,
+            from: () => Effect.succeed({ body: "late-default" }),
          },
-        },
+        }),
+        endpoint: Endpoint.path("/chat", { baseURL: "https://fake.local" }),
+        framing: fakeFraming,
+      })
+
+      const prepared = yield* (yield* LLMClient.Service).prepare(
+        LLM.updateRequest(request, { model: updateModel(request.model, { route: duplicate }) }),
      )

-      expect(mapped({ id: "fake-model", region: "us-east-1" }).native).toEqual({ region: "us-east-1" })
-    }),
-  )
-
-  it.effect("rejects duplicate route ids", () =>
-    Effect.gen(function* () {
-      expect(() =>
-        Route.make({
-          id: "fake",
-          protocol: Protocol.make({
-            ...fakeProtocol,
-            body: {
-              ...fakeProtocol.body,
-              from: () => Effect.succeed({ body: "late-default" }),
-            },
-          }),
-          endpoint: Endpoint.path("/chat"),
-          framing: fakeFraming,
-        }),
-      ).toThrow('Duplicate LLM route id "fake"')
-    }),
-  )
-
-  it.effect("rejects missing route", () =>
-    Effect.gen(function* () {
-      const llm = yield* LLMClient.Service
-      const error = yield* llm
-        .prepare(LLM.updateRequest(request, { model: updateModel(request.model, { route: "missing" }) }))
-        .pipe(Effect.flip)
-
-      expect(error.message).toContain("No LLM route")
+      expect(prepared.body).toEqual({ body: "late-default" })
    }),
  )
 })
--- a/packages/llm/test/auth-options.types.ts
+++ b/packages/llm/test/auth-options.types.ts
@@ -2,8 +2,17 @@ import { Config } from "effect"
 import type { Auth } from "../src/route/auth"
 import type { ModelFactory } from "../src/route/auth-options"
 import { Auth as RuntimeAuth } from "../src/route/auth"
+import * as OpenAIChat from "../src/protocols/openai-chat"
+import * as AmazonBedrock from "../src/providers/amazon-bedrock"
+import * as Anthropic from "../src/providers/anthropic"
 import * as Azure from "../src/providers/azure"
+import * as Cloudflare from "../src/providers/cloudflare"
+import * as GitHubCopilot from "../src/providers/github-copilot"
+import * as Google from "../src/providers/google"
 import * as OpenAI from "../src/providers/openai"
+import * as OpenAICompatible from "../src/providers/openai-compatible"
+import * as OpenRouter from "../src/providers/openrouter"
+import * as XAI from "../src/providers/xai"

 type BaseOptions = {
  readonly baseURL?: string
@@ -19,6 +28,20 @@ declare const optionalAuthModel: ModelFactory<BaseOptions, "optional", Model>
 declare const requiredAuthModel: ModelFactory<BaseOptions, "required", Model>
 const configApiKey = Config.redacted("OPENAI_API_KEY")

+OpenAIChat.route.model({ id: "gpt-4.1-mini" })
+
+// @ts-expect-error route model selection does not configure endpoints.
+OpenAIChat.route.model({ id: "gpt-4.1-mini", baseURL: "https://gateway.example.com/v1" })
+
+// @ts-expect-error route model selection does not configure query params.
+OpenAIChat.route.model({ id: "gpt-4.1-mini", queryParams: { debug: "1" } })
+
+// @ts-expect-error route model selection does not configure auth.
+OpenAIChat.route.model({ id: "gpt-4.1-mini", auth })
+
+// @ts-expect-error route model selection does not configure api keys.
+OpenAIChat.route.model({ id: "gpt-4.1-mini", apiKey: "sk-test" })
+
 optionalAuthModel("gpt-4.1-mini")
 optionalAuthModel("gpt-4.1-mini", {})
 optionalAuthModel("gpt-4.1-mini", { apiKey: "sk-test" })
@@ -45,56 +68,101 @@ requiredAuthModel("custom-model", {})
 requiredAuthModel("custom-model", { apiKey: "key", auth })

 OpenAI.responses("gpt-4.1-mini")
-OpenAI.responses("gpt-4.1-mini", {})
-OpenAI.responses("gpt-4.1-mini", { apiKey: "sk-test" })
-OpenAI.responses("gpt-4.1-mini", { apiKey: configApiKey })
-OpenAI.responses("gpt-4.1-mini", { auth: RuntimeAuth.bearer("oauth-token") })
-OpenAI.responses("gpt-4.1-mini", {
+OpenAI.configure({}).responses("gpt-4.1-mini")
+OpenAI.configure({ apiKey: "sk-test" }).responses("gpt-4.1-mini")
+OpenAI.configure({ apiKey: configApiKey }).responses("gpt-4.1-mini")
+OpenAI.configure({ auth: RuntimeAuth.bearer("oauth-token") }).responses("gpt-4.1-mini")
+OpenAI.configure({
  auth: RuntimeAuth.headers({ authorization: "Bearer gateway" }),
  baseURL: "https://gateway.example.com/v1",
-})
-OpenAI.responses("gpt-4.1-mini", {
+}).responses("gpt-4.1-mini")
+OpenAI.configure({
  generation: { maxTokens: 100 },
  providerOptions: { openai: { store: false } },
-})
+}).responses("gpt-4.1-mini")
+
+// @ts-expect-error OpenAI model selectors only accept model ids.
+OpenAI.configure({ apiKey: "sk-test" }).responses("gpt-4.1-mini", {})

 // @ts-expect-error apiKey only accepts string, Redacted<string>, or Config<string | Redacted<string>>.
-OpenAI.responses("gpt-4.1-mini", { apiKey: 123 })
+OpenAI.configure({ apiKey: 123 })

 // @ts-expect-error provider helpers reject unknown top-level options.
-OpenAI.responses("gpt-4.1-mini", { bogus: true })
+OpenAI.configure({ bogus: true })

 // @ts-expect-error common generation options remain typed.
-OpenAI.responses("gpt-4.1-mini", { generation: { maxTokens: "many" } })
+OpenAI.configure({ generation: { maxTokens: "many" } })

 // @ts-expect-error provider-native options remain typed.
-OpenAI.responses("gpt-4.1-mini", { providerOptions: { openai: { store: "false" } } })
+OpenAI.configure({ providerOptions: { openai: { store: "false" } } })

 // @ts-expect-error auth is an override, so OpenAI rejects apiKey with auth.
-OpenAI.responses("gpt-4.1-mini", { apiKey: "sk-test", auth: RuntimeAuth.bearer("oauth-token") })
+OpenAI.configure({ apiKey: "sk-test", auth: RuntimeAuth.bearer("oauth-token") })

 OpenAI.chat("gpt-4.1-mini")
-OpenAI.chat("gpt-4.1-mini", { apiKey: "sk-test" })
-OpenAI.chat("gpt-4.1-mini", { apiKey: configApiKey })
-OpenAI.chat("gpt-4.1-mini", { auth: RuntimeAuth.bearer("oauth-token") })
+OpenAI.configure({ apiKey: "sk-test" }).chat("gpt-4.1-mini")
+OpenAI.configure({ apiKey: configApiKey }).chat("gpt-4.1-mini")
+OpenAI.configure({ auth: RuntimeAuth.bearer("oauth-token") }).chat("gpt-4.1-mini")
+
+// @ts-expect-error OpenAI chat selectors only accept model ids.
+OpenAI.configure({ apiKey: "sk-test" }).chat("gpt-4.1-mini", {})

 // @ts-expect-error auth is an override, so OpenAI Chat rejects apiKey with auth.
-OpenAI.chat("gpt-4.1-mini", { apiKey: "sk-test", auth: RuntimeAuth.bearer("oauth-token") })
+OpenAI.configure({ apiKey: "sk-test", auth: RuntimeAuth.bearer("oauth-token") })

 // @ts-expect-error Azure requires at least one of `resourceName` or `baseURL`.
-Azure.responses("deployment")
-Azure.responses("deployment", { apiKey: "azure-key", resourceName: "resource" })
-Azure.responses("deployment", { apiKey: configApiKey, resourceName: "resource" })
-Azure.responses("deployment", { auth: RuntimeAuth.header("api-key", "azure-key"), resourceName: "resource" })
+Azure.configure()
+Azure.configure({ apiKey: "azure-key", resourceName: "resource" }).responses("deployment")
+Azure.configure({ apiKey: configApiKey, resourceName: "resource" }).responses("deployment")
+Azure.configure({ auth: RuntimeAuth.header("api-key", "azure-key"), resourceName: "resource" }).responses("deployment")
+
+// @ts-expect-error Azure model selectors only accept deployment ids.
+Azure.configure({ apiKey: "azure-key", resourceName: "resource" }).responses("deployment", {})

 // @ts-expect-error auth is an override, so Azure rejects apiKey with auth.
-Azure.responses("deployment", { apiKey: "azure-key", auth: RuntimeAuth.header("api-key", "override") })
+Azure.configure({ resourceName: "resource", apiKey: "azure-key", auth: RuntimeAuth.header("api-key", "override") })

-// @ts-expect-error Azure requires at least one of `resourceName` or `baseURL`.
-Azure.chat("deployment")
-Azure.chat("deployment", { apiKey: "azure-key", resourceName: "resource" })
-Azure.chat("deployment", { apiKey: configApiKey, resourceName: "resource" })
-Azure.chat("deployment", { auth: RuntimeAuth.header("api-key", "azure-key"), resourceName: "resource" })
+Azure.configure({ apiKey: "azure-key", resourceName: "resource" }).chat("deployment")
+Azure.configure({ apiKey: configApiKey, resourceName: "resource" }).chat("deployment")
+Azure.configure({ auth: RuntimeAuth.header("api-key", "azure-key"), resourceName: "resource" }).chat("deployment")
+
+// @ts-expect-error Azure chat model selectors only accept deployment ids.
+Azure.configure({ apiKey: "azure-key", resourceName: "resource" }).chat("deployment", {})

 // @ts-expect-error auth is an override, so Azure Chat rejects apiKey with auth.
-Azure.chat("deployment", { apiKey: "azure-key", auth: RuntimeAuth.header("api-key", "override") })
+Azure.configure({ resourceName: "resource", apiKey: "azure-key", auth: RuntimeAuth.header("api-key", "override") })
+
+Anthropic.configure({ apiKey: "anthropic-key" }).model("claude-haiku")
+// @ts-expect-error Anthropic model selectors only accept model ids.
+Anthropic.configure({ apiKey: "anthropic-key" }).model("claude-haiku", {})
+
+Google.configure({ apiKey: "google-key" }).model("gemini-2.5-flash")
+// @ts-expect-error Google model selectors only accept model ids.
+Google.configure({ apiKey: "google-key" }).model("gemini-2.5-flash", {})
+
+AmazonBedrock.configure({ apiKey: "bedrock-key" }).model("anthropic.claude")
+// @ts-expect-error Bedrock model selectors only accept model ids.
+AmazonBedrock.configure({ apiKey: "bedrock-key" }).model("anthropic.claude", {})
+
+OpenRouter.configure({ apiKey: "openrouter-key" }).model("openai/gpt-4o-mini")
+// @ts-expect-error OpenRouter model selectors only accept model ids.
+OpenRouter.configure({ apiKey: "openrouter-key" }).model("openai/gpt-4o-mini", {})
+
+XAI.configure({ apiKey: "xai-key" }).responses("grok-4")
+XAI.configure({ apiKey: "xai-key" }).chat("grok-4")
+// @ts-expect-error xAI Responses selectors only accept model ids.
+XAI.configure({ apiKey: "xai-key" }).responses("grok-4", {})
+// @ts-expect-error xAI Chat selectors only accept model ids.
+XAI.configure({ apiKey: "xai-key" }).chat("grok-4", {})
+
+OpenAICompatible.deepseek.configure({ apiKey: "deepseek-key" }).model("deepseek-chat")
+// @ts-expect-error OpenAI-compatible family selectors only accept model ids.
+OpenAICompatible.deepseek.configure({ apiKey: "deepseek-key" }).model("deepseek-chat", {})
+
+Cloudflare.CloudflareWorkersAI.configure({ accountId: "account", apiKey: "cf-key" }).model("@cf/meta/llama")
+// @ts-expect-error Cloudflare Workers AI model selectors only accept model ids.
+Cloudflare.CloudflareWorkersAI.configure({ accountId: "account", apiKey: "cf-key" }).model("@cf/meta/llama", {})
+
+GitHubCopilot.configure({ baseURL: "https://copilot.test", apiKey: "copilot-key" }).model("gpt-4.1")
+// @ts-expect-error GitHub Copilot model selectors only accept model ids.
+GitHubCopilot.configure({ baseURL: "https://copilot.test", apiKey: "copilot-key" }).model("gpt-4.1", {})
--- a/packages/llm/test/auth.test.ts
+++ b/packages/llm/test/auth.test.ts
@@ -3,11 +3,13 @@ import { ConfigProvider, Effect } from "effect"
 import { Headers } from "effect/unstable/http"
 import { LLM } from "../src"
 import { Auth } from "../src/route/auth"
+import * as OpenAIChat from "../src/protocols/openai-chat"
+import { Model } from "../src/schema"
 import { it } from "./lib/effect"

 const request = LLM.request({
  id: "req_auth",
-  model: LLM.model({ id: "fake-model", provider: "fake", route: "fake", baseURL: "https://fake.local" }),
+  model: Model.make({ id: "fake-model", provider: "fake", route: OpenAIChat.route }),
  prompt: "hello",
 })

--- a/packages/llm/test/cache-policy.test.ts
+++ b/packages/llm/test/cache-policy.test.ts
@@ -1,36 +1,32 @@
 import { describe, expect, test } from "bun:test"
 import { Effect } from "effect"
 import { CacheHint, LLM, Message } from "../src"
-import { LLMClient } from "../src/route"
+import { Auth, LLMClient } from "../src/route"
+import { AmazonBedrock } from "../src/providers"
 import * as AnthropicMessages from "../src/protocols/anthropic-messages"
-import * as BedrockConverse from "../src/protocols/bedrock-converse"
 import * as Gemini from "../src/protocols/gemini"
 import * as OpenAIChat from "../src/protocols/openai-chat"
 import { applyCachePolicy } from "../src/cache-policy"
 import { it } from "./lib/effect"

-const anthropicModel = AnthropicMessages.model({
-  id: "claude-sonnet-4-5",
-  baseURL: "https://api.anthropic.test/v1/",
-  headers: { "x-api-key": "test" },
-})
+const anthropicModel = AnthropicMessages.route
+  .with({ endpoint: { baseURL: "https://api.anthropic.test/v1/" }, auth: Auth.header("x-api-key", "test") })
+  .model({ id: "claude-sonnet-4-5" })

-const bedrockModel = BedrockConverse.model({
-  id: "anthropic.claude-3-5-sonnet-20241022-v2:0",
+const bedrockModel = AmazonBedrock.configure({
  credentials: { region: "us-east-1", accessKeyId: "fixture", secretAccessKey: "fixture" },
-})
+}).model("anthropic.claude-3-5-sonnet-20241022-v2:0")

-const openaiModel = OpenAIChat.model({
-  id: "gpt-4o-mini",
-  baseURL: "https://api.openai.test/v1/",
-  headers: { authorization: "Bearer test" },
-})
+const openaiModel = OpenAIChat.route
+  .with({ endpoint: { baseURL: "https://api.openai.test/v1/" }, auth: Auth.bearer("test") })
+  .model({ id: "gpt-4o-mini" })

-const geminiModel = Gemini.model({
-  id: "gemini-2.5-flash",
-  baseURL: "https://generativelanguage.test/v1beta/",
-  headers: { "x-goog-api-key": "test" },
-})
+const geminiModel = Gemini.route
+  .with({
+    endpoint: { baseURL: "https://generativelanguage.test/v1beta/" },
+    auth: Auth.header("x-goog-api-key", "test"),
+  })
+  .model({ id: "gemini-2.5-flash" })

 describe("applyCachePolicy", () => {
  it.effect("undefined cache resolves to 'auto' (the recommended default)", () =>
--- a/packages/llm/test/endpoint.test.ts
+++ b/packages/llm/test/endpoint.test.ts
@@ -1,37 +1,40 @@
 import { describe, expect, test } from "bun:test"
 import { LLM } from "../src"
+import * as OpenAIChat from "../src/protocols/openai-chat"
 import { Endpoint } from "../src/route"
+import { Model } from "../src/schema"

-const request = (input: { readonly baseURL: string; readonly queryParams?: Record<string, string> }) =>
+const request = () =>
  LLM.request({
-    model: LLM.model({
+    model: Model.make({
      id: "model-1",
      provider: "test",
-      route: "test-route",
-      baseURL: input.baseURL,
-      queryParams: input.queryParams,
+      route: OpenAIChat.route,
    }),
    prompt: "hello",
  })

 describe("Endpoint", () => {
  test("appends a static path to the model's baseURL", () => {
-    const url = Endpoint.render(Endpoint.path("/chat"), {
-      request: request({ baseURL: "https://api.example.test/v1/" }),
+    const url = Endpoint.render(Endpoint.path("/chat", { baseURL: "https://api.example.test/v1/" }), {
+      request: request(),
      body: {},
    })

    expect(url.toString()).toBe("https://api.example.test/v1/chat")
  })

-  test("model query params are appended to the rendered URL", () => {
-    const url = Endpoint.render(Endpoint.path("/chat?alt=sse"), {
-      request: request({
+  test("endpoint query params are appended to the rendered URL", () => {
+    const url = Endpoint.render(
+      Endpoint.path("/chat?alt=sse", {
        baseURL: "https://custom.example.test/root/",
-        queryParams: { "api-version": "2026-01-01", alt: "json" },
+        query: { "api-version": "2026-01-01", alt: "json" },
      }),
-      body: {},
-    })
+      {
+        request: request(),
+        body: {},
+      },
+    )

    expect(url.toString()).toBe("https://custom.example.test/root/chat?alt=json&api-version=2026-01-01")
  })
@@ -40,9 +43,10 @@ describe("Endpoint", () => {
    const url = Endpoint.render(
      Endpoint.path<{ readonly modelId: string }>(
        ({ body }) => `/model/${encodeURIComponent(body.modelId)}/converse-stream`,
+        { baseURL: "https://bedrock-runtime.us-east-1.amazonaws.com" },
      ),
      {
-        request: request({ baseURL: "https://bedrock-runtime.us-east-1.amazonaws.com" }),
+        request: request(),
        body: { modelId: "us.amazon.nova-micro-v1:0" },
      },
    )
--- a/packages/llm/test/executor.test.ts
+++ b/packages/llm/test/executor.test.ts
@@ -106,8 +106,8 @@ describe("RequestExecutor", () => {
      expect(errorHttp(error)?.body).toBe("rate limited")
    }).pipe(
      Effect.provide(
-        responsesLayer([
-          ...Array.from(
+        responsesLayer(
+          Array.from(
            { length: 3 },
            () =>
              new Response("rate limited", {
@@ -115,7 +115,7 @@ describe("RequestExecutor", () => {
                headers: { "retry-after-ms": "0", "x-request-id": "req_123", "x-api-key": "secret" },
              }),
          ),
-        ]),
+        ),
      ),
    ),
  )
@@ -388,7 +388,9 @@ describe("RequestExecutor", () => {
  it.effect("does not retry after a successful response reaches stream parsing", () =>
    Effect.gen(function* () {
      const attempts = yield* Ref.make(0)
-      const model = OpenAIChat.model({ id: "gpt-4o-mini", baseURL: "https://api.openai.test/v1" })
+      const model = OpenAIChat.route
+        .with({ endpoint: { baseURL: "https://api.openai.test/v1" } })
+        .model({ id: "gpt-4o-mini" })
      const error = yield* LLMClient.generate(LLM.request({ model, prompt: "Say hello." })).pipe(
        Effect.provide(
          dynamicResponse((input) =>
--- a/packages/llm/test/exports.test.ts
+++ b/packages/llm/test/exports.test.ts
@@ -2,7 +2,14 @@ import { describe, expect, test } from "bun:test"
 import { LLM, LLMClient, Provider } from "@opencode-ai/llm"
 import { Route, Protocol } from "@opencode-ai/llm/route"
 import { Provider as ProviderSubpath } from "@opencode-ai/llm/provider"
-import { Cloudflare, OpenAI, OpenAICompatible, OpenRouter, XAI } from "@opencode-ai/llm/providers"
+import {
+  CloudflareAIGateway,
+  CloudflareWorkersAI,
+  OpenAI,
+  OpenAICompatible,
+  OpenRouter,
+  XAI,
+} from "@opencode-ai/llm/providers"
 import * as GitHubCopilot from "@opencode-ai/llm/providers/github-copilot"
 import { OpenAIChat, OpenAICompatibleChat, OpenAIResponses } from "@opencode-ai/llm/protocols"
 import * as AnthropicMessages from "@opencode-ai/llm/protocols/anthropic-messages"
@@ -24,26 +31,25 @@ describe("public exports", () => {
  test("provider barrels expose user-facing facades", () => {
    expect(OpenAI.model).toBeFunction()
    expect(OpenAI.provider.model).toBe(OpenAI.model)
-    expect(OpenAI.apis.responses).toBe(OpenAI.responses)
-    expect(OpenAI.apis.responsesWebSocket).toBe(OpenAI.responsesWebSocket)
+    expect(OpenAI.provider.responses).toBe(OpenAI.responses)
+    expect(OpenAI.provider.responsesWebSocket).toBe(OpenAI.responsesWebSocket)
+    expect(OpenAI.configure({ apiKey: "fixture" }).responses).toBeFunction()
    expect(OpenAICompatible.deepseek.model).toBeFunction()
-    expect(Cloudflare.model).toBeFunction()
-    expect(Cloudflare.provider.model).toBe(Cloudflare.model)
-    expect(Cloudflare.aiGateway).toBeFunction()
-    expect(Cloudflare.workersAI).toBeFunction()
+    expect(CloudflareAIGateway.configure).toBeFunction()
+    expect(CloudflareAIGateway.configure({ accountId: "fixture", gatewayApiKey: "fixture" }).model).toBeFunction()
+    expect(CloudflareWorkersAI.configure).toBeFunction()
+    expect(CloudflareWorkersAI.configure({ accountId: "fixture", apiKey: "fixture" }).model).toBeFunction()
    expect(OpenRouter.model).toBeFunction()
    expect(OpenRouter.provider.model).toBe(OpenRouter.model)
    expect(XAI.model).toBeFunction()
    expect(XAI.provider.model).toBe(XAI.model)
-    expect(XAI.apis.responses).toBe(XAI.responses)
-    expect(XAI.apis.chat).toBe(XAI.chat)
-    expect(XAI.responses("grok-4.3", { apiKey: "fixture" })).toMatchObject({
-      route: "openai-responses",
-    })
-    expect(XAI.chat("grok-4.3", { apiKey: "fixture" })).toMatchObject({
-      route: "openai-compatible-chat",
-    })
-    expect(GitHubCopilot.model).toBeFunction()
+    expect(XAI.provider.responses).toBe(XAI.responses)
+    expect(XAI.provider.chat).toBe(XAI.chat)
+    expect(XAI.configure({ apiKey: "fixture" }).responses("grok-4.3").route.id).toBe("openai-responses")
+    expect(XAI.configure({ apiKey: "fixture" }).chat("grok-4.3").route.id).toBe("openai-compatible-chat")
+    expect(
+      GitHubCopilot.configure({ baseURL: "https://api.githubcopilot.test", apiKey: "fixture" }).model,
+    ).toBeFunction()
  })

  test("protocol barrels expose supported low-level routes", () => {
--- a/packages/llm/test/fixtures/media/restroom.png
+++ b/packages/llm/test/fixtures/media/restroom.png
--- a/packages/llm/test/fixtures/recordings/gemini/gemini-2-5-flash-image.json
+++ b/packages/llm/test/fixtures/recordings/gemini/gemini-2-5-flash-image.json
--- a/packages/llm/test/fixtures/recordings/openai-responses-cache/reports-cached-tokens-on-identical-second-call.json
+++ b/packages/llm/test/fixtures/recordings/openai-responses-cache/reports-cached-tokens-on-identical-second-call.json
--- a/packages/llm/test/generate-object.test.ts
+++ b/packages/llm/test/generate-object.test.ts
@@ -2,6 +2,7 @@ import { describe, expect, test } from "bun:test"
 import { Effect, Schema } from "effect"
 import { LLM } from "../src"
 import * as OpenAIChat from "../src/protocols/openai-chat"
+import { Auth } from "../src/route"
 import { Tool, toDefinitions } from "../src/tool"
 import { it } from "./lib/effect"
 import { dynamicResponse } from "./lib/http"
@@ -17,11 +18,9 @@ type OpenAIChatBody = {
  }>
 }

-const model = OpenAIChat.model({
-  id: "gpt-4o-mini",
-  baseURL: "https://api.openai.test/v1/",
-  headers: { authorization: "Bearer test" },
-})
+const model = OpenAIChat.route
+  .with({ endpoint: { baseURL: "https://api.openai.test/v1/" }, auth: Auth.bearer("test") })
+  .model({ id: "gpt-4o-mini" })

 const Json = Schema.fromJsonString(Schema.Unknown)
 const decodeJson = Schema.decodeUnknownSync(Json)
--- a/packages/llm/test/lib/http.ts
+++ b/packages/llm/test/lib/http.ts
@@ -1,8 +1,9 @@
 import { Effect, Layer, Ref } from "effect"
 import { HttpClient, HttpClientRequest, HttpClientResponse } from "effect/unstable/http"
-import { LLMClient, RequestExecutor } from "../../src/route"
+import { LLMClient, RequestExecutor, WebSocketExecutor } from "../../src/route"
 import type { Service as LLMClientService } from "../../src/route/client"
 import type { Service as RequestExecutorService } from "../../src/route/executor"
+import type { Service as WebSocketExecutorService } from "../../src/route/transport/websocket"

 export type HandlerInput = {
  readonly request: HttpClientRequest.HttpClientRequest
@@ -31,12 +32,13 @@ const handlerLayer = (handler: Handler): Layer.Layer<HttpClient.HttpClient> =>
    ),
  )

-export type RuntimeEnv = RequestExecutorService | LLMClientService
+export type RuntimeEnv = RequestExecutorService | WebSocketExecutorService | LLMClientService

 export const runtimeLayer = (layer: Layer.Layer<HttpClient.HttpClient>): Layer.Layer<RuntimeEnv> => {
  const requestExecutorLayer = RequestExecutor.layer.pipe(Layer.provide(layer))
-  const llmClientLayer = LLMClient.layer.pipe(Layer.provide(requestExecutorLayer))
-  return Layer.mergeAll(requestExecutorLayer, llmClientLayer)
+  const deps = Layer.mergeAll(requestExecutorLayer, WebSocketExecutor.layer)
+  const llmClientLayer = LLMClient.layer.pipe(Layer.provide(deps))
+  return Layer.mergeAll(deps, llmClientLayer)
 }

 const SSE_HEADERS = { "content-type": "text/event-stream" } as const
--- a/packages/llm/test/llm.test.ts
+++ b/packages/llm/test/llm.test.ts
@@ -1,18 +1,23 @@
 import { describe, expect, test } from "bun:test"
 import { LLM, LLMResponse } from "../src"
-import { LLMRequest, Message, ModelRef, ToolCallPart, ToolChoice, ToolDefinition, ToolResultPart } from "../src/schema"
+import * as OpenAIChat from "../src/protocols/openai-chat"
+import * as OpenAIResponses from "../src/protocols/openai-responses"
+import { LLMRequest, Message, Model, ToolCallPart, ToolChoice, ToolDefinition, ToolResultPart } from "../src/schema"
+
+const chatRoute = OpenAIChat.route
+const responsesRoute = OpenAIResponses.route

 describe("llm constructors", () => {
  test("builds canonical schema classes from ergonomic input", () => {
    const request = LLM.request({
      id: "req_1",
-      model: LLM.model({ id: "fake-model", provider: "fake", route: "openai-chat", baseURL: "https://fake.local" }),
+      model: Model.make({ id: "fake-model", provider: "fake", route: chatRoute }),
      system: "You are concise.",
      prompt: "Say hello.",
    })

    expect(request).toBeInstanceOf(LLMRequest)
-    expect(request.model).toBeInstanceOf(ModelRef)
+    expect(request.model).toBeInstanceOf(Model)
    expect(request.messages[0]).toBeInstanceOf(Message)
    expect(request.system).toEqual([{ type: "text", text: "You are concise." }])
    expect(request.messages[0]?.content).toEqual([{ type: "text", text: "Say hello." }])
@@ -23,7 +28,7 @@ describe("llm constructors", () => {
  test("updates requests without spreading schema class instances", () => {
    const base = LLM.request({
      id: "req_1",
-      model: LLM.model({ id: "fake-model", provider: "fake", route: "openai-chat", baseURL: "https://fake.local" }),
+      model: Model.make({ id: "fake-model", provider: "fake", route: chatRoute }),
      prompt: "Say hello.",
    })
    const updated = LLM.updateRequest(base, {
@@ -38,16 +43,16 @@ describe("llm constructors", () => {
    expect(updated.messages.map((message) => message.role)).toEqual(["user", "assistant"])
  })

-  test("keeps request options separate from model defaults", () => {
+  test("keeps request options separate from route defaults", () => {
    const request = LLM.request({
-      model: LLM.model({
+      model: Model.make({
        id: "fake-model",
        provider: "fake",
-        route: "openai-chat",
-        baseURL: "https://fake.local",
-        generation: { maxTokens: 100, temperature: 1 },
-        providerOptions: { openai: { store: false, metadata: { model: true } } },
-        http: { body: { metadata: { model: true } }, headers: { "x-shared": "model" }, query: { model: "1" } },
+        route: chatRoute.with({
+          generation: { maxTokens: 100, temperature: 1 },
+          providerOptions: { openai: { store: false, metadata: { model: true } } },
+          http: { body: { metadata: { model: true } }, headers: { "x-shared": "model" }, query: { model: "1" } },
+        }),
      }),
      prompt: "Say hello.",
      generation: { temperature: 0 },
@@ -67,7 +72,7 @@ describe("llm constructors", () => {
  test("updates canonical requests from the request datatype", () => {
    const base = LLM.request({
      id: "req_1",
-      model: LLM.model({ id: "fake-model", provider: "fake", route: "openai-chat", baseURL: "https://fake.local" }),
+      model: Model.make({ id: "fake-model", provider: "fake", route: chatRoute }),
      prompt: "Say hello.",
    })
    const updated = LLMRequest.update(base, { messages: [...base.messages, Message.assistant("Hi.")] })
@@ -80,14 +85,18 @@ describe("llm constructors", () => {
  })

  test("updates canonical models from the model datatype", () => {
-    const base = LLM.model({ id: "fake-model", provider: "fake", route: "openai-chat", baseURL: "https://fake.local" })
-    const updated = ModelRef.update(base, { route: "openai-responses" })
+    const base = Model.make({
+      id: "fake-model",
+      provider: "fake",
+      route: chatRoute,
+    })
+    const updated = Model.update(base, { route: responsesRoute })

-    expect(updated).toBeInstanceOf(ModelRef)
+    expect(updated).toBeInstanceOf(Model)
    expect(String(updated.id)).toBe("fake-model")
-    expect(updated.route).toBe("openai-responses")
-    expect(String(ModelRef.input(updated).provider)).toBe("fake")
-    expect(ModelRef.update(updated, {})).toBe(updated)
+    expect(updated.route).toBe(responsesRoute)
+    expect(String(Model.input(updated).provider)).toBe("fake")
+    expect(Model.update(updated, {})).toBe(updated)
  })

  test("builds tool choices from names and tools", () => {
@@ -105,7 +114,11 @@ describe("llm constructors", () => {
    expect(ToolChoice.make("required")).toEqual(new ToolChoice({ type: "required" }))
    expect(
      LLM.request({
-        model: LLM.model({ id: "fake-model", provider: "fake", route: "openai-chat", baseURL: "https://fake.local" }),
+        model: Model.make({
+          id: "fake-model",
+          provider: "fake",
+          route: chatRoute,
+        }),
        prompt: "Use tools if needed.",
        toolChoice: "required",
      }).toolChoice,
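
Reviewer sketch for the constructor tests above: the diff moves `generation`, `providerOptions`, and `http` defaults off the model and onto the route via `chatRoute.with({...})`, and it keeps the `Model.update(updated, {})` identity assertion. A minimal sketch of those two behaviors, assuming plain-object stand-ins, might look as follows. `makeRoute`, `makeModel`, `updateModel`, and the `RouteDefaults` shape are hypothetical names for illustration and are not the package's real `Route` or `Model` implementation.

```typescript
// Hypothetical stand-ins for illustration; not the package's real Route or Model.
type RouteDefaults = {
  readonly generation?: { readonly maxTokens?: number; readonly temperature?: number }
  readonly headers?: Readonly<Record<string, string>>
}

type SketchRoute = {
  readonly id: string
  readonly defaults: RouteDefaults
  readonly with: (overlay: RouteDefaults) => SketchRoute
}

type SketchModel = { readonly id: string; readonly provider: string; readonly route: SketchRoute }

// Builds a route whose with(...) returns a new route and never mutates the receiver.
const makeRoute = (id: string, defaults: RouteDefaults = {}): SketchRoute => ({
  id,
  defaults,
  with: (overlay) =>
    makeRoute(id, {
      generation: { ...defaults.generation, ...overlay.generation },
      headers: { ...defaults.headers, ...overlay.headers },
    }),
})

const makeModel = (input: SketchModel): SketchModel => ({ ...input })

// An empty patch returns the same instance, mirroring the identity assertion in the test.
const updateModel = (model: SketchModel, patch: Partial<SketchModel>): SketchModel =>
  Object.keys(patch).length === 0 ? model : { ...model, ...patch }

const chatRoute = makeRoute("openai-chat")
const responsesRoute = makeRoute("openai-responses")
const tunedRoute = chatRoute.with({ generation: { maxTokens: 100, temperature: 1 } })

const base = makeModel({ id: "fake-model", provider: "fake", route: tunedRoute })
const updated = updateModel(base, { route: responsesRoute })
```

In this sketch the shared `chatRoute` stays untouched when a test layers defaults on top of it, which is what lets one module-level route constant be reused across tests.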
--- a/packages/llm/test/provider.types.ts
+++ b/packages/llm/test/provider.types.ts
@@ -1,9 +1,9 @@
 import { Provider } from "../src/provider"
-import { ProviderID, type ModelRef } from "../src/schema"
+import { ProviderID, type Model } from "../src/schema"

-declare const model: (id: string) => ModelRef
-declare const requiredModel: (id: string, options: { readonly baseURL: string }) => ModelRef
-declare const chat: (id: string, options: { readonly apiKey: string }) => ModelRef
+declare const model: (id: string) => Model
+declare const requiredModel: (id: string, options: { readonly baseURL: string }) => Model
+declare const chat: (id: string, options: { readonly apiKey: string }) => Model

 Provider.make({
  id: ProviderID.make("example"),
@@ -22,6 +22,8 @@ const requiredProvider = Provider.make({
  model: requiredModel,
 })

+// Provider.make is advanced structural typing coverage; built-in providers use
+// configure(...).model(id) facades instead of second-argument selectors.
 requiredProvider.model("custom", { baseURL: "https://example.com/v1" })

 // @ts-expect-error Provider.make preserves required model options.
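
Reviewer sketch for the comment added above: built-in providers now expose `configure(...).model(id)` rather than a second-argument options selector. The shape of that facade can be sketched roughly as below, assuming a single bearer-authenticated endpoint. The `configure` function, `ProviderOptions`, `ConfiguredModel`, the `example` provider id, and the placeholder URL are all hypothetical and do not come from the package.

```typescript
// Hypothetical facade for illustration; names and option shapes are assumptions.
type ProviderOptions = { readonly apiKey: string; readonly baseURL?: string }

type ConfiguredModel = {
  readonly id: string
  readonly provider: string
  readonly route: {
    readonly id: string
    readonly endpoint: { readonly baseURL: string }
    readonly auth: { readonly bearer: string }
  }
}

const DEFAULT_BASE_URL = "https://api.example.test/v1" // placeholder endpoint

// configure(...) captures connection settings once; model(id) only selects a model.
const configure = (options: ProviderOptions) => {
  const route = {
    id: "example-chat",
    endpoint: { baseURL: options.baseURL ?? DEFAULT_BASE_URL },
    auth: { bearer: options.apiKey },
  }
  return {
    model: (id: string): ConfiguredModel => ({ id, provider: "example", route }),
  }
}

const example = configure({ apiKey: "fixture" })
const small = example.model("example-small")
const large = example.model("example-large")
```

Configuring once and selecting many models is the pattern `golden.recorded.test.ts` adopts later in this commit, where a single `Anthropic.configure(...)` result yields both the Haiku and Opus models.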
--- a/packages/llm/test/provider/anthropic-messages-cache.recorded.test.ts
+++ b/packages/llm/test/provider/anthropic-messages-cache.recorded.test.ts
@@ -3,14 +3,13 @@ import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { CacheHint, LLM } from "../../src"
 import { LLMClient } from "../../src/route"
-import * as AnthropicMessages from "../../src/protocols/anthropic-messages"
+import * as Anthropic from "../../src/providers/anthropic"
 import { LARGE_CACHEABLE_SYSTEM } from "../recorded-scenarios"
 import { recordedTests } from "../recorded-test"

-const model = AnthropicMessages.model({
-  id: "claude-haiku-4-5-20251001",
+const model = Anthropic.configure({
  apiKey: process.env.ANTHROPIC_API_KEY ?? "fixture",
-})
+}).model("claude-haiku-4-5-20251001")

 // Two identical generations in a row. The first call writes the prefix into
 // Anthropic's cache; the second should report a cache read against the same
--- a/packages/llm/test/provider/anthropic-messages.recorded.test.ts
+++ b/packages/llm/test/provider/anthropic-messages.recorded.test.ts
@@ -3,14 +3,13 @@ import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { LLM, LLMError, Message, ToolCallPart } from "../../src"
 import { LLMClient } from "../../src/route"
-import * as AnthropicMessages from "../../src/protocols/anthropic-messages"
+import * as Anthropic from "../../src/providers/anthropic"
 import { weatherToolName } from "../recorded-scenarios"
 import { recordedTests } from "../recorded-test"

-const model = AnthropicMessages.model({
-  id: "claude-haiku-4-5-20251001",
+const model = Anthropic.configure({
  apiKey: process.env.ANTHROPIC_API_KEY ?? "fixture",
-})
+}).model("claude-haiku-4-5-20251001")

 const malformedToolOrderRequest = LLM.request({
  id: "recorded_anthropic_malformed_tool_order",
--- a/packages/llm/test/provider/anthropic-messages.test.ts
+++ b/packages/llm/test/provider/anthropic-messages.test.ts
@@ -1,17 +1,15 @@
 import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { CacheHint, LLM, LLMError, Message, ToolCallPart, Usage } from "../../src"
-import { LLMClient } from "../../src/route"
+import { Auth, LLMClient } from "../../src/route"
 import * as AnthropicMessages from "../../src/protocols/anthropic-messages"
 import { it } from "../lib/effect"
 import { fixedResponse } from "../lib/http"
 import { sseEvents } from "../lib/sse"

-const model = AnthropicMessages.model({
-  id: "claude-sonnet-4-5",
-  baseURL: "https://api.anthropic.test/v1/",
-  headers: { "x-api-key": "test" },
-})
+const model = AnthropicMessages.route
+  .with({ endpoint: { baseURL: "https://api.anthropic.test/v1/" }, auth: Auth.header("x-api-key", "test") })
+  .model({ id: "claude-sonnet-4-5" })

 const request = LLM.request({
  id: "req_1",
--- a/packages/llm/test/provider/bedrock-converse-cache.recorded.test.ts
+++ b/packages/llm/test/provider/bedrock-converse-cache.recorded.test.ts
@@ -2,7 +2,7 @@ import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { CacheHint, LLM } from "../../src"
 import { LLMClient } from "../../src/route"
-import * as BedrockConverse from "../../src/protocols/bedrock-converse"
+import { AmazonBedrock } from "../../src/providers"
 import { LARGE_CACHEABLE_SYSTEM } from "../recorded-scenarios"
 import { recordedTests } from "../recorded-test"

@@ -12,15 +12,14 @@ const RECORDING_REGION = process.env.BEDROCK_RECORDING_REGION ?? "us-east-1"
 // doesn't reliably surface `cacheRead`/`cacheWrite` in usage, so the second
 // call wouldn't deterministically prove cache mapping works. Override with
 // BEDROCK_CACHE_MODEL_ID if your account has access elsewhere.
-const model = BedrockConverse.model({
-  id: process.env.BEDROCK_CACHE_MODEL_ID ?? "us.anthropic.claude-haiku-4-5-20251001-v1:0",
+const model = AmazonBedrock.configure({
  credentials: {
    region: RECORDING_REGION,
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? "fixture",
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? "fixture",
    sessionToken: process.env.AWS_SESSION_TOKEN,
  },
-})
+}).model(process.env.BEDROCK_CACHE_MODEL_ID ?? "us.anthropic.claude-haiku-4-5-20251001-v1:0")

 const cacheRequest = LLM.request({
  id: "recorded_bedrock_cache",
--- a/packages/llm/test/provider/bedrock-converse.test.ts
+++ b/packages/llm/test/provider/bedrock-converse.test.ts
@@ -4,6 +4,7 @@ import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { CacheHint, LLM, Message, ToolCallPart, ToolChoice } from "../../src"
 import { LLMClient } from "../../src/route"
+import { AmazonBedrock } from "../../src/providers"
 import * as BedrockConverse from "../../src/protocols/bedrock-converse"
 import { it } from "../lib/effect"
 import { fixedResponse } from "../lib/http"
@@ -52,11 +53,10 @@ const eventStreamBody = (...payloads: ReadonlyArray<readonly [string, object]>)
 const fixedBytes = (bytes: Uint8Array) =>
  fixedResponse(bytes.slice().buffer, { headers: { "content-type": "application/vnd.amazon.eventstream" } })

-const model = BedrockConverse.model({
-  id: "anthropic.claude-3-5-sonnet-20240620-v1:0",
+const model = AmazonBedrock.configure({
  baseURL: "https://bedrock-runtime.test",
  apiKey: "test-bearer",
-})
+}).model("anthropic.claude-3-5-sonnet-20240620-v1:0")

 const baseRequest = LLM.request({
  id: "req_1",
@@ -156,6 +156,55 @@ describe("Bedrock Converse route", () => {
    }),
  )

+  it.effect("lowers image content in tool-result messages", () =>
+    Effect.gen(function* () {
+      const prepared = yield* LLMClient.prepare(
+        LLM.request({
+          id: "req_tool_image",
+          model,
+          messages: [
+            Message.user("Capture the screen."),
+            Message.assistant([ToolCallPart.make({ id: "tool_1", name: "screenshot", input: {} })]),
+            Message.tool({
+              id: "tool_1",
+              name: "screenshot",
+              result: {
+                type: "content",
+                value: [
+                  { type: "text", text: "Screenshot captured." },
+                  { type: "media", mediaType: "image/png", data: "AAAA" },
+                ],
+              },
+            }),
+          ],
+          cache: "none",
+        }),
+      )
+
+      expect(prepared.body).toMatchObject({
+        messages: [
+          { role: "user", content: [{ text: "Capture the screen." }] },
+          {
+            role: "assistant",
+            content: [{ toolUse: { toolUseId: "tool_1", name: "screenshot", input: {} } }],
+          },
+          {
+            role: "user",
+            content: [
+              {
+                toolResult: {
+                  toolUseId: "tool_1",
+                  content: [{ text: "Screenshot captured." }, { image: { format: "png", source: { bytes: "AAAA" } } }],
+                  status: "success",
+                },
+              },
+            ],
+          },
+        ],
+      })
+    }),
+  )
+
  it.effect("decodes text-delta + messageStop + metadata usage from binary event stream", () =>
    Effect.gen(function* () {
      const body = eventStreamBody(
@@ -249,39 +298,32 @@ describe("Bedrock Converse route", () => {

  it.effect("rejects requests with no auth path", () =>
    Effect.gen(function* () {
-      const unsignedModel = BedrockConverse.model({
-        id: "anthropic.claude-3-5-sonnet-20240620-v1:0",
+      const unsignedModel = AmazonBedrock.configure({
        baseURL: "https://bedrock-runtime.test",
-      })
+      }).model("anthropic.claude-3-5-sonnet-20240620-v1:0")
      const error = yield* LLMClient.generate(LLM.updateRequest(baseRequest, { model: unsignedModel })).pipe(
        Effect.provide(fixedBytes(eventStreamBody(["messageStop", { stopReason: "end_turn" }]))),
        Effect.flip,
      )

-      expect(error.message).toContain("Bedrock Converse requires either model.apiKey")
+      expect(error.message).toContain("Bedrock Converse requires either route bearer auth or AWS credentials")
    }),
  )

  it.effect("signs requests with SigV4 when AWS credentials are provided (deterministic plumbing check)", () =>
    Effect.gen(function* () {
-      const signed = BedrockConverse.model({
-        id: "anthropic.claude-3-5-sonnet-20240620-v1:0",
+      const signed = AmazonBedrock.configure({
        baseURL: "https://bedrock-runtime.test",
        credentials: {
          region: "us-east-1",
          accessKeyId: "AKIAIOSFODNN7EXAMPLE",
          secretAccessKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
        },
-      })
+      }).model("anthropic.claude-3-5-sonnet-20240620-v1:0")
      const prepared = yield* LLMClient.prepare(LLM.updateRequest(baseRequest, { model: signed }))

      expect(prepared.route).toBe("bedrock-converse")
-      // The prepare phase doesn't sign — toHttp does. We assert the credential
-      // is plumbed onto the model native field for the signer to find.
-      expect(prepared.model.native).toMatchObject({
-        aws_credentials: { region: "us-east-1", accessKeyId: "AKIAIOSFODNN7EXAMPLE" },
-        aws_region: "us-east-1",
-      })
+      expect(prepared.model).toBe(signed)
    }),
  )

@@ -531,18 +573,17 @@ describe("Bedrock Converse route", () => {
 const RECORDING_REGION = process.env.BEDROCK_RECORDING_REGION ?? "us-east-1"

 const recordedModel = () =>
-  BedrockConverse.model({
+  AmazonBedrock.configure({
    // Most newer Anthropic models on Bedrock require a cross-region inference
    // profile (`us.` prefix). Nova does not require an Anthropic use-case form
    // and is on-demand-throughput accessible by default for most accounts.
-    id: process.env.BEDROCK_MODEL_ID ?? "us.amazon.nova-micro-v1:0",
    credentials: {
      region: RECORDING_REGION,
      accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? "fixture",
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? "fixture",
      sessionToken: process.env.AWS_SESSION_TOKEN,
    },
-  })
+  }).model(process.env.BEDROCK_MODEL_ID ?? "us.amazon.nova-micro-v1:0")

 const recorded = recordedTests({
  prefix: "bedrock-converse",
@@ -598,7 +639,6 @@ describe("Bedrock Converse recorded", () => {

  recorded.effect.with("drives a tool loop", { tags: ["tool", "tool-loop", "golden"] }, () =>
    Effect.gen(function* () {
-      const llm = yield* LLMClient.Service
      expectWeatherToolLoop(
        yield* runWeatherToolLoop(
          weatherToolLoopRequest({
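
Reviewer sketch for the new "lowers image content in tool-result messages" test in this file: the expected body maps a `text` part to `{ text }` and an `image/png` media part to `{ image: { format: "png", source: { bytes } } }` inside a `toolResult` block. One way such a lowering step could be written is sketched below. `lowerPart` and `lowerToolResult` are hypothetical helpers, the part shapes are copied from the test's input and expected output, and rejecting non-image media is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical lowering helper; shapes are copied from the test's input and expected body.
type ContentPart =
  | { readonly type: "text"; readonly text: string }
  | { readonly type: "media"; readonly mediaType: string; readonly data: string }

type BedrockBlock =
  | { readonly text: string }
  | { readonly image: { readonly format: string; readonly source: { readonly bytes: string } } }

// Maps "image/png" to format "png"; non-image media is rejected in this sketch (assumption).
const lowerPart = (part: ContentPart): BedrockBlock => {
  if (part.type === "text") return { text: part.text }
  const [kind, format] = part.mediaType.split("/")
  if (kind !== "image" || !format) throw new Error(`Unsupported media type: ${part.mediaType}`)
  return { image: { format, source: { bytes: part.data } } }
}

// Wraps the lowered parts in a toolResult block keyed by the originating tool call id.
const lowerToolResult = (toolUseId: string, parts: ReadonlyArray<ContentPart>) => ({
  toolResult: { toolUseId, content: parts.map(lowerPart), status: "success" as const },
})

const lowered = lowerToolResult("tool_1", [
  { type: "text", text: "Screenshot captured." },
  { type: "media", mediaType: "image/png", data: "AAAA" },
])
```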
--- a/packages/llm/test/provider/cloudflare.test.ts
+++ b/packages/llm/test/provider/cloudflare.test.ts
@@ -2,7 +2,7 @@ import { describe, expect } from "bun:test"
 import { ConfigProvider, Effect, Schema } from "effect"
 import { HttpClientRequest } from "effect/unstable/http"
 import { LLM } from "../../src"
-import * as Cloudflare from "../../src/providers/cloudflare"
+import { CloudflareAIGateway, CloudflareWorkersAI } from "../../src/providers/cloudflare"
 import { LLMClient } from "../../src/route"
 import { it } from "../lib/effect"
 import { dynamicResponse } from "../lib/http"
@@ -21,18 +21,18 @@ const deltaChunk = (delta: object, finishReason: string | null = null) => ({
 describe("Cloudflare", () => {
  it.effect("prepares AI Gateway models through the OpenAI-compatible Chat protocol", () =>
    Effect.gen(function* () {
-      const model = Cloudflare.aiGateway("workers-ai/@cf/meta/llama-3.3-70b-instruct", {
+      const model = CloudflareAIGateway.configure({
        accountId: "test-account",
        gatewayId: "test-gateway",
        apiKey: "test-token",
-      })
+      }).model("workers-ai/@cf/meta/llama-3.3-70b-instruct")

      expect(model).toMatchObject({
        id: "workers-ai/@cf/meta/llama-3.3-70b-instruct",
        provider: "cloudflare-ai-gateway",
-        route: "cloudflare-ai-gateway",
-        baseURL: "https://gateway.ai.cloudflare.com/v1/test-account/test-gateway/compat",
+        route: { id: "cloudflare-ai-gateway" },
      })
+      expect(model.route.endpoint.baseURL).toBe("https://gateway.ai.cloudflare.com/v1/test-account/test-gateway/compat")

      const prepared = yield* LLMClient.prepare(LLM.request({ model, prompt: "Say hello." }))

@@ -49,11 +49,11 @@ describe("Cloudflare", () => {
    Effect.gen(function* () {
      const response = yield* LLM.generate(
        LLM.request({
-          model: Cloudflare.aiGateway("openai/gpt-4o-mini", {
+          model: CloudflareAIGateway.configure({
            accountId: "test-account",
            gatewayId: "test-gateway",
            apiKey: "test-token",
-          }),
+          }).model("openai/gpt-4o-mini"),
          prompt: "Say hello.",
        }),
      ).pipe(
@@ -86,11 +86,11 @@ describe("Cloudflare", () => {
  it.effect("defaults AI Gateway id to default when omitted or blank", () =>
    Effect.gen(function* () {
      expect(
-        Cloudflare.aiGateway("workers-ai/@cf/meta/llama-3.3-70b-instruct", {
+        CloudflareAIGateway.configure({
          accountId: "test-account",
          gatewayId: "",
          gatewayApiKey: "test-token",
-        }).baseURL,
+        }).model("workers-ai/@cf/meta/llama-3.3-70b-instruct").route.endpoint.baseURL,
      ).toBe("https://gateway.ai.cloudflare.com/v1/test-account/default/compat")
    }),
  )
@@ -99,11 +99,11 @@ describe("Cloudflare", () => {
    Effect.gen(function* () {
      yield* LLM.generate(
        LLM.request({
-          model: Cloudflare.aiGateway("openai/gpt-4o-mini", {
+          model: CloudflareAIGateway.configure({
            accountId: "test-account",
            gatewayApiKey: "gateway-token",
            apiKey: "provider-token",
-          }),
+          }).model("openai/gpt-4o-mini"),
          prompt: "Say hello.",
        }),
      ).pipe(
@@ -129,31 +129,31 @@ describe("Cloudflare", () => {
    Effect.gen(function* () {
      const prepared = yield* LLMClient.prepare(
        LLM.request({
-          model: Cloudflare.aiGateway("openai/gpt-4o-mini", {
+          model: CloudflareAIGateway.configure({
            baseURL: "https://gateway.proxy.test/v1/custom/compat",
            apiKey: "test-token",
-          }),
+          }).model("openai/gpt-4o-mini"),
          prompt: "Say hello.",
        }),
      )

-      expect(prepared.model.baseURL).toBe("https://gateway.proxy.test/v1/custom/compat")
+      expect(prepared.model.route.endpoint.baseURL).toBe("https://gateway.proxy.test/v1/custom/compat")
    }),
  )

  it.effect("prepares direct Workers AI models through the OpenAI-compatible Chat protocol", () =>
    Effect.gen(function* () {
-      const model = Cloudflare.workersAI("@cf/meta/llama-3.1-8b-instruct", {
+      const model = CloudflareWorkersAI.configure({
        accountId: "test-account",
        apiKey: "test-token",
-      })
+      }).model("@cf/meta/llama-3.1-8b-instruct")

      expect(model).toMatchObject({
        id: "@cf/meta/llama-3.1-8b-instruct",
        provider: "cloudflare-workers-ai",
-        route: "cloudflare-workers-ai",
-        baseURL: "https://api.cloudflare.com/client/v4/accounts/test-account/ai/v1",
+        route: { id: "cloudflare-workers-ai" },
      })
+      expect(model.route.endpoint.baseURL).toBe("https://api.cloudflare.com/client/v4/accounts/test-account/ai/v1")

      const prepared = yield* LLMClient.prepare(LLM.request({ model, prompt: "Say hello." }))

@@ -170,10 +170,10 @@ describe("Cloudflare", () => {
    Effect.gen(function* () {
      const response = yield* LLM.generate(
        LLM.request({
-          model: Cloudflare.workersAI("@cf/meta/llama-3.1-8b-instruct", {
+          model: CloudflareWorkersAI.configure({
            accountId: "test-account",
            apiKey: "test-token",
-          }),
+          }).model("@cf/meta/llama-3.1-8b-instruct"),
          prompt: "Say hello.",
        }),
      ).pipe(
@@ -205,9 +205,9 @@ describe("Cloudflare", () => {
    Effect.gen(function* () {
      yield* LLM.generate(
        LLM.request({
-          model: Cloudflare.workersAI("@cf/meta/llama-3.1-8b-instruct", {
+          model: CloudflareWorkersAI.configure({
            accountId: "test-account",
-          }),
+          }).model("@cf/meta/llama-3.1-8b-instruct"),
          prompt: "Say hello.",
        }),
      ).pipe(
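
Reviewer sketch for the Cloudflare tests above: after this change the derived base URL is read from `model.route.endpoint.baseURL`, and the AI Gateway id falls back to `default` when omitted or blank. The URL derivation those assertions imply can be sketched as below. `aiGatewayBaseURL` and `workersAIBaseURL` are hypothetical helper names, and treating whitespace-only ids as blank is an assumption, since the test only covers the empty string.

```typescript
// Hypothetical helpers; the URL templates are taken from the assertions in the tests above.
type GatewayOptions = { readonly accountId: string; readonly gatewayId?: string }

// Missing or blank gateway ids fall back to "default" (whitespace handling is an assumption).
const aiGatewayBaseURL = ({ accountId, gatewayId }: GatewayOptions): string =>
  `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId?.trim() || "default"}/compat`

// Direct Workers AI uses the account-scoped OpenAI-compatible endpoint.
const workersAIBaseURL = (accountId: string): string =>
  `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1`

const explicitGateway = aiGatewayBaseURL({ accountId: "test-account", gatewayId: "test-gateway" })
const blankGateway = aiGatewayBaseURL({ accountId: "test-account", gatewayId: "" })
const workers = workersAIBaseURL("test-account")
```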
--- a/packages/llm/test/provider/gemini-cache.recorded.test.ts
+++ b/packages/llm/test/provider/gemini-cache.recorded.test.ts
@@ -2,14 +2,13 @@ import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { LLM } from "../../src"
 import { LLMClient } from "../../src/route"
-import * as Gemini from "../../src/protocols/gemini"
+import * as Google from "../../src/providers/google"
 import { LARGE_CACHEABLE_SYSTEM } from "../recorded-scenarios"
 import { recordedTests } from "../recorded-test"

-const model = Gemini.model({
-  id: "gemini-2.5-flash",
+const model = Google.configure({
  apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY ?? process.env.GEMINI_API_KEY ?? "fixture",
-})
+}).model("gemini-2.5-flash")

 // Gemini does implicit prefix caching on 2.5+ models above ~1024 tokens. The
 // `CacheHint` is currently a no-op for Gemini (the explicit `CachedContent`
--- a/packages/llm/test/provider/gemini.test.ts
+++ b/packages/llm/test/provider/gemini.test.ts
@@ -1,17 +1,18 @@
 import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { LLM, LLMError, Message, ToolCallPart, Usage } from "../../src"
-import { LLMClient } from "../../src/route"
+import { Auth, LLMClient } from "../../src/route"
 import * as Gemini from "../../src/protocols/gemini"
 import { it } from "../lib/effect"
 import { fixedResponse } from "../lib/http"
 import { sseEvents, sseRaw } from "../lib/sse"

-const model = Gemini.model({
-  id: "gemini-2.5-flash",
-  baseURL: "https://generativelanguage.test/v1beta/",
-  headers: { "x-goog-api-key": "test" },
-})
+const model = Gemini.route
+  .with({
+    endpoint: { baseURL: "https://generativelanguage.test/v1beta/" },
+    auth: Auth.header("x-goog-api-key", "test"),
+  })
+  .model({ id: "gemini-2.5-flash" })

 const request = LLM.request({
  id: "req_1",
--- a/packages/llm/test/provider/golden.recorded.test.ts
+++ b/packages/llm/test/provider/golden.recorded.test.ts
@@ -1,32 +1,30 @@
 import { Redactor } from "@opencode-ai/http-recorder"
-import * as AnthropicMessages from "../../src/protocols/anthropic-messages"
-import * as Gemini from "../../src/protocols/gemini"
-import * as OpenAIChat from "../../src/protocols/openai-chat"
-import * as OpenAIResponses from "../../src/protocols/openai-responses"
-import * as Cloudflare from "../../src/providers/cloudflare"
+import * as Anthropic from "../../src/providers/anthropic"
+import { CloudflareAIGateway, CloudflareWorkersAI } from "../../src/providers/cloudflare"
+import * as Google from "../../src/providers/google"
 import * as OpenAI from "../../src/providers/openai"
 import * as OpenAICompatible from "../../src/providers/openai-compatible"
 import * as OpenRouter from "../../src/providers/openrouter"
 import * as XAI from "../../src/providers/xai"
 import { describeRecordedGoldenScenarios } from "../recorded-golden"

-const openAIChat = OpenAIChat.model({ id: "gpt-4o-mini", apiKey: process.env.OPENAI_API_KEY ?? "fixture" })
-const openAIResponses = OpenAIResponses.model({ id: "gpt-5.5", apiKey: process.env.OPENAI_API_KEY ?? "fixture" })
-const openAIResponsesWebSocket = OpenAI.responsesWebSocket("gpt-4.1-mini", {
+const openAI = OpenAI.configure({
  apiKey: process.env.OPENAI_API_KEY ?? "fixture",
 })
-const anthropicHaiku = AnthropicMessages.model({
-  id: "claude-haiku-4-5-20251001",
+const openAIChat = openAI.chat("gpt-4o-mini")
+const openAIResponses = openAI.responses("gpt-5.5")
+const openAIResponsesWebSocket = openAI.responsesWebSocket("gpt-4.1-mini")
+const anthropic = Anthropic.configure({
  apiKey: process.env.ANTHROPIC_API_KEY ?? "fixture",
 })
-const anthropicOpus = AnthropicMessages.model({
-  id: "claude-opus-4-7",
-  apiKey: process.env.ANTHROPIC_API_KEY ?? "fixture",
-})
-const gemini = Gemini.model({ id: "gemini-2.5-flash", apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY ?? "fixture" })
-const xaiBasic = XAI.model("grok-3-mini", { apiKey: process.env.XAI_API_KEY ?? "fixture" })
-const xaiFlagship = XAI.model("grok-4.3", { apiKey: process.env.XAI_API_KEY ?? "fixture" })
-const cloudflareAIGatewayWorkers = Cloudflare.aiGateway("workers-ai/@cf/meta/llama-3.1-8b-instruct", {
+const anthropicHaiku = anthropic.model("claude-haiku-4-5-20251001")
+const anthropicOpus = anthropic.model("claude-opus-4-7")
+const google = Google.configure({ apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY ?? "fixture" })
+const gemini = google.model("gemini-2.5-flash")
+const xai = XAI.configure({ apiKey: process.env.XAI_API_KEY ?? "fixture" })
+const xaiBasic = xai.model("grok-3-mini")
+const xaiFlagship = xai.model("grok-4.3")
+const cloudflareAIGateway = CloudflareAIGateway.configure({
  accountId: process.env.CLOUDFLARE_ACCOUNT_ID ?? "fixture-account",
  gatewayId:
    process.env.CLOUDFLARE_GATEWAY_ID && process.env.CLOUDFLARE_GATEWAY_ID !== process.env.CLOUDFLARE_ACCOUNT_ID
@@ -34,32 +32,31 @@ const cloudflareAIGatewayWorkers = Cloudflare.aiGateway("workers-ai/@cf/meta/lla
      : undefined,
  gatewayApiKey: process.env.CLOUDFLARE_API_TOKEN ?? "fixture",
 })
-const cloudflareAIGatewayWorkersTools = Cloudflare.aiGateway("workers-ai/@cf/openai/gpt-oss-20b", {
-  accountId: process.env.CLOUDFLARE_ACCOUNT_ID ?? "fixture-account",
-  gatewayId:
-    process.env.CLOUDFLARE_GATEWAY_ID && process.env.CLOUDFLARE_GATEWAY_ID !== process.env.CLOUDFLARE_ACCOUNT_ID
-      ? process.env.CLOUDFLARE_GATEWAY_ID
-      : undefined,
-  gatewayApiKey: process.env.CLOUDFLARE_API_TOKEN ?? "fixture",
-})
-const cloudflareWorkersAI = Cloudflare.workersAI("@cf/meta/llama-3.1-8b-instruct", {
+const cloudflareWorkers = CloudflareWorkersAI.configure({
  accountId: process.env.CLOUDFLARE_ACCOUNT_ID ?? "fixture-account",
  apiKey: process.env.CLOUDFLARE_API_KEY ?? "fixture",
 })
-const cloudflareWorkersAITools = Cloudflare.workersAI("@cf/openai/gpt-oss-20b", {
-  accountId: process.env.CLOUDFLARE_ACCOUNT_ID ?? "fixture-account",
-  apiKey: process.env.CLOUDFLARE_API_KEY ?? "fixture",
-})
-const deepseek = OpenAICompatible.deepseek.model("deepseek-chat", { apiKey: process.env.DEEPSEEK_API_KEY ?? "fixture" })
-const together = OpenAICompatible.togetherai.model("meta-llama/Llama-3.3-70B-Instruct-Turbo", {
-  apiKey: process.env.TOGETHER_AI_API_KEY ?? "fixture",
-})
-const groq = OpenAICompatible.groq.model("llama-3.3-70b-versatile", { apiKey: process.env.GROQ_API_KEY ?? "fixture" })
-const openrouter = OpenRouter.model("openai/gpt-4o-mini", { apiKey: process.env.OPENROUTER_API_KEY ?? "fixture" })
-const openrouterGpt55 = OpenRouter.model("openai/gpt-5.5", { apiKey: process.env.OPENROUTER_API_KEY ?? "fixture" })
-const openrouterOpus = OpenRouter.model("anthropic/claude-opus-4.7", {
+const cloudflareAIGatewayWorkers = cloudflareAIGateway.model("workers-ai/@cf/meta/llama-3.1-8b-instruct")
+const cloudflareAIGatewayWorkersTools = cloudflareAIGateway.model("workers-ai/@cf/openai/gpt-oss-20b")
+const cloudflareWorkersAI = cloudflareWorkers.model("@cf/meta/llama-3.1-8b-instruct")
+const cloudflareWorkersAITools = cloudflareWorkers.model("@cf/openai/gpt-oss-20b")
+const deepseek = OpenAICompatible.deepseek
+  .configure({ apiKey: process.env.DEEPSEEK_API_KEY ?? "fixture" })
+  .model("deepseek-chat")
+const together = OpenAICompatible.togetherai
+  .configure({
+    apiKey: process.env.TOGETHER_AI_API_KEY ?? "fixture",
+  })
+  .model("meta-llama/Llama-3.3-70B-Instruct-Turbo")
+const groq = OpenAICompatible.groq
+  .configure({ apiKey: process.env.GROQ_API_KEY ?? "fixture" })
+  .model("llama-3.3-70b-versatile")
+const openRouter = OpenRouter.configure({ apiKey: process.env.OPENROUTER_API_KEY ?? "fixture" })
+const openrouter = openRouter.model("openai/gpt-4o-mini")
+const openrouterGpt55 = openRouter.model("openai/gpt-5.5")
+const openrouterOpus = OpenRouter.configure({
  apiKey: process.env.OPENROUTER_API_KEY ?? "fixture",
-})
+}).model("anthropic/claude-opus-4.7")

 const redactCloudflareURL = (url: string) =>
  url
@@ -120,7 +117,7 @@ describeRecordedGoldenScenarios([
    prefix: "gemini",
    model: gemini,
    requires: ["GOOGLE_GENERATIVE_AI_API_KEY"],
-    scenarios: [{ id: "text", maxTokens: 80 }, "tool-call"],
+    scenarios: [{ id: "text", maxTokens: 80 }, "tool-call", { id: "image", maxTokens: 160 }],
  },
  {
    name: "xAI Grok 3 Mini",
--- a/packages/llm/test/provider/openai-chat.test.ts
+++ b/packages/llm/test/provider/openai-chat.test.ts
@@ -1,11 +1,11 @@
 import { describe, expect } from "bun:test"
 import { Effect, Schema, Stream } from "effect"
 import { HttpClientRequest } from "effect/unstable/http"
-import { LLM, LLMError, Message, ToolCallPart, Usage } from "../../src"
+import { LLM, LLMError, Message, Model, ToolCallPart, Usage } from "../../src"
 import * as Azure from "../../src/providers/azure"
 import * as OpenAI from "../../src/providers/openai"
 import * as OpenAIChat from "../../src/protocols/openai-chat"
-import { LLMClient } from "../../src/route"
+import { Auth, LLMClient } from "../../src/route"
 import { it } from "../lib/effect"
 import { dynamicResponse, fixedResponse, truncatedStream } from "../lib/http"
 import { deltaChunk, usageChunk } from "../lib/openai-chunks"
@@ -15,11 +15,9 @@ const TargetJson = Schema.fromJsonString(Schema.Unknown)
 const encodeJson = Schema.encodeSync(TargetJson)
 const decodeJson = Schema.decodeUnknownSync(TargetJson)

-const model = OpenAIChat.model({
-  id: "gpt-4o-mini",
-  baseURL: "https://api.openai.test/v1/",
-  headers: { authorization: "Bearer test" },
-})
+const model = OpenAIChat.route
+  .with({ endpoint: { baseURL: "https://api.openai.test/v1/" }, auth: Auth.bearer("test") })
+  .model({ id: "gpt-4o-mini" })

 const request = LLM.request({
  id: "req_1",
@@ -56,7 +54,7 @@ describe("OpenAI Chat route", () => {
    Effect.gen(function* () {
      const prepared = yield* LLMClient.prepare<OpenAIChat.OpenAIChatBody>(
        LLM.request({
-          model: OpenAI.chat("gpt-4o-mini", { baseURL: "https://api.openai.test/v1/" }),
+          model: OpenAI.configure({ baseURL: "https://api.openai.test/v1/", apiKey: "test" }).chat("gpt-4o-mini"),
          prompt: "think",
          providerOptions: { openai: { reasoningEffort: "low" } },
        }),
@@ -69,7 +67,9 @@ describe("OpenAI Chat route", () => {

  it.effect("adds native query params to the Chat Completions URL", () =>
    LLMClient.generate(
-      LLM.updateRequest(request, { model: OpenAIChat.model({ ...model, queryParams: { "api-version": "v1" } }) }),
+      LLM.updateRequest(request, {
+        model: Model.update(model, { route: model.route.with({ endpoint: { query: { "api-version": "v1" } } }) }),
+      }),
    ).pipe(
      Effect.provide(
        dynamicResponse((input) =>
@@ -88,17 +88,18 @@ describe("OpenAI Chat route", () => {
  it.effect("uses Azure api-key header for static OpenAI Chat keys", () =>
    LLMClient.generate(
      LLM.updateRequest(request, {
-        model: Azure.chat("gpt-4o-mini", {
+        model: Azure.configure({
          baseURL: "https://opencode-test.openai.azure.com/openai/v1/",
          apiKey: "azure-key",
          headers: { authorization: "Bearer stale" },
-        }),
+        }).chat("gpt-4o-mini"),
      }),
    ).pipe(
      Effect.provide(
        dynamicResponse((input) =>
          Effect.gen(function* () {
            const web = yield* HttpClientRequest.toWeb(input.request).pipe(Effect.orDie)
+            expect(web.url).toBe("https://opencode-test.openai.azure.com/openai/v1/chat/completions?api-version=v1")
            expect(web.headers.get("api-key")).toBe("azure-key")
            expect(web.headers.get("authorization")).toBeNull()
            return input.respond(sseEvents(deltaChunk({}, "stop")), {
@@ -113,7 +114,9 @@ describe("OpenAI Chat route", () => {
  it.effect("applies serializable HTTP overlays after payload lowering", () =>
    LLMClient.generate(
      LLM.updateRequest(request, {
-        model: OpenAIChat.model({ ...model, apiKey: "fresh-key", headers: { authorization: "Bearer stale" } }),
+        model: model.route
+          .with({ auth: Auth.bearer("fresh-key"), headers: { authorization: "Bearer stale" } })
+          .model({ id: model.id }),
        http: {
          body: { metadata: { source: "test" } },
          headers: { authorization: "Bearer request", "x-custom": "yes" },
--- a/packages/llm/test/provider/openai-compatible-chat.test.ts
+++ b/packages/llm/test/provider/openai-compatible-chat.test.ts
@@ -2,7 +2,7 @@ import { describe, expect } from "bun:test"
 import { Effect, Schema } from "effect"
 import { HttpClientRequest } from "effect/unstable/http"
 import { LLM, Message, ToolCallPart } from "../../src"
-import { LLMClient } from "../../src/route"
+import { Auth, LLMClient } from "../../src/route"
 import * as OpenAICompatible from "../../src/providers/openai-compatible"
 import * as OpenAICompatibleChat from "../../src/protocols/openai-compatible-chat"
 import { it } from "../lib/effect"
@@ -12,13 +12,13 @@ import { sseEvents } from "../lib/sse"
 const Json = Schema.fromJsonString(Schema.Unknown)
 const decodeJson = Schema.decodeUnknownSync(Json)

-const model = OpenAICompatibleChat.model({
-  id: "deepseek-chat",
-  provider: "deepseek",
-  baseURL: "https://api.deepseek.test/v1/",
-  apiKey: "test-key",
-  queryParams: { "api-version": "2026-01-01" },
-})
+const model = OpenAICompatibleChat.route
+  .with({
+    provider: "deepseek",
+    endpoint: { baseURL: "https://api.deepseek.test/v1/", query: { "api-version": "2026-01-01" } },
+    auth: Auth.bearer("test-key"),
+  })
+  .model({ id: "deepseek-chat" })

 const request = LLM.request({
  id: "req_1",
@@ -63,10 +63,11 @@ describe("OpenAI-compatible Chat route", () => {
      expect(prepared.model).toMatchObject({
        id: "deepseek-chat",
        provider: "deepseek",
-        route: "openai-compatible-chat",
+        route: { id: "openai-compatible-chat" },
+      })
+      expect(prepared.model.route.endpoint).toMatchObject({
        baseURL: "https://api.deepseek.test/v1/",
-        apiKey: "test-key",
-        queryParams: { "api-version": "2026-01-01" },
+        query: { "api-version": "2026-01-01" },
      })
      expect(prepared.body).toEqual({
        model: "deepseek-chat",
@@ -93,13 +94,12 @@ describe("OpenAI-compatible Chat route", () => {
    Effect.gen(function* () {
      expect(
        providerFamilies.map(([provider, family]) => {
-          const model = family.model(`${provider}-model`, { apiKey: "test-key" })
+          const model = family.configure({ apiKey: "test-key" }).model(`${provider}-model`)
          return {
            id: String(model.id),
            provider: String(model.provider),
-            route: model.route,
-            baseURL: model.baseURL,
-            apiKey: model.apiKey,
+            route: model.route.id,
+            baseURL: model.route.endpoint.baseURL,
          }
        }),
      ).toEqual(
@@ -108,19 +108,20 @@ describe("OpenAI-compatible Chat route", () => {
          provider,
          route: "openai-compatible-chat",
          baseURL,
-          apiKey: "test-key",
        })),
      )

-      const custom = OpenAICompatible.deepseek.model("deepseek-chat", {
-        apiKey: "test-key",
-        baseURL: "https://custom.deepseek.test/v1",
-      })
+      const custom = OpenAICompatible.deepseek
+        .configure({
+          apiKey: "test-key",
+          baseURL: "https://custom.deepseek.test/v1",
+        })
+        .model("deepseek-chat")
      expect(custom).toMatchObject({
        provider: "deepseek",
-        route: "openai-compatible-chat",
-        baseURL: "https://custom.deepseek.test/v1",
+        route: { id: "openai-compatible-chat" },
      })
+      expect(custom.route.endpoint.baseURL).toBe("https://custom.deepseek.test/v1")
    }),
  )

--- a/packages/llm/test/provider/openai-responses-cache.recorded.test.ts
+++ b/packages/llm/test/provider/openai-responses-cache.recorded.test.ts
@@ -2,14 +2,13 @@ import { describe, expect } from "bun:test"
 import { Effect } from "effect"
 import { LLM } from "../../src"
 import { LLMClient } from "../../src/route"
-import * as OpenAIResponses from "../../src/protocols/openai-responses"
+import * as OpenAI from "../../src/providers/openai"
 import { LARGE_CACHEABLE_SYSTEM } from "../recorded-scenarios"
 import { recordedTests } from "../recorded-test"

-const model = OpenAIResponses.model({
-  id: "gpt-4.1-mini",
+const model = OpenAI.configure({
  apiKey: process.env.OPENAI_API_KEY ?? "fixture",
-})
+}).responses("gpt-4.1-mini")

 // OpenAI caches prefixes automatically once they cross the 1024-token threshold;
 // `CacheHint` is a no-op for the wire body. The stable signal is the
--- a/packages/llm/test/provider/openai-responses.test.ts
+++ b/packages/llm/test/provider/openai-responses.test.ts
@@ -1,7 +1,7 @@
 import { describe, expect } from "bun:test"
 import { ConfigProvider, Effect, Layer, Stream } from "effect"
 import { Headers, HttpClientRequest } from "effect/unstable/http"
-import { LLM, LLMError, Message, ToolCallPart, Usage } from "../../src"
+import { LLM, LLMError, Message, Model, ToolCallPart, Usage } from "../../src"
 import { Auth, LLMClient, RequestExecutor, WebSocketExecutor } from "../../src/route"
 import * as Azure from "../../src/providers/azure"
 import * as OpenAI from "../../src/providers/openai"
@@ -11,11 +11,9 @@ import { it } from "../lib/effect"
 import { dynamicResponse, fixedResponse } from "../lib/http"
 import { sseEvents } from "../lib/sse"

-const model = OpenAIResponses.model({
-  id: "gpt-4.1-mini",
-  baseURL: "https://api.openai.test/v1/",
-  headers: { authorization: "Bearer test" },
-})
+const model = OpenAIResponses.route
+  .with({ endpoint: { baseURL: "https://api.openai.test/v1/" }, auth: Auth.bearer("test") })
+  .model({ id: "gpt-4.1-mini" })

 const request = LLM.request({
  id: "req_1",
@@ -49,7 +47,9 @@ describe("OpenAI Responses route", () => {
    Effect.gen(function* () {
      const prepared = yield* LLMClient.prepare(
        LLM.updateRequest(request, {
-          model: OpenAI.responsesWebSocket("gpt-4.1-mini", { baseURL: "https://api.openai.test/v1/", apiKey: "test" }),
+          model: OpenAI.configure({ baseURL: "https://api.openai.test/v1/", apiKey: "test" }).responsesWebSocket(
+            "gpt-4.1-mini",
+          ),
        }),
      )

@@ -95,10 +95,12 @@ describe("OpenAI Responses route", () => {
      )
      const response = yield* LLMClient.generate(
        LLM.request({
-          model: OpenAI.responsesWebSocket("gpt-4.1-mini", { baseURL: "https://api.openai.test/v1/", apiKey: "test" }),
+          model: OpenAI.configure({ baseURL: "https://api.openai.test/v1/", apiKey: "test" }).responsesWebSocket(
+            "gpt-4.1-mini",
+          ),
          prompt: "Say hello.",
        }),
-      ).pipe(Effect.provide(LLMClient.layerWithWebSocket.pipe(Layer.provide(deps))))
+      ).pipe(Effect.provide(LLMClient.layer.pipe(Layer.provide(deps))))

      expect(response.text).toBe("Hi")
      expect(opened).toEqual([{ url: "wss://api.openai.test/v1/responses", authorization: "Bearer test" }])
@@ -113,33 +115,6 @@ describe("OpenAI Responses route", () => {
    }),
  )

-  it.effect("requires WebSocket runtime for OpenAI Responses WebSocket", () =>
-    Effect.gen(function* () {
-      const error = yield* LLMClient.generate(
-        LLM.request({
-          model: OpenAI.responsesWebSocket("gpt-4.1-mini", { baseURL: "https://api.openai.test/v1/", apiKey: "test" }),
-          prompt: "Say hello.",
-        }),
-      ).pipe(
-        Effect.provide(
-          LLMClient.layer.pipe(
-            Layer.provide(
-              Layer.succeed(
-                RequestExecutor.Service,
-                RequestExecutor.Service.of({
-                  execute: () => Effect.die("unexpected HTTP request"),
-                }),
-              ),
-            ),
-          ),
-        ),
-        Effect.flip,
-      )
-
-      expect(error.message).toContain("requires WebSocketExecutor.Service")
-    }),
-  )
-
  it.effect("fails immediately when WebSocket is already closed", () =>
    Effect.gen(function* () {
      const error = yield* WebSocketExecutor.fromWebSocket(
@@ -155,7 +130,7 @@ describe("OpenAI Responses route", () => {
    Effect.gen(function* () {
      yield* LLMClient.generate(
        LLM.updateRequest(request, {
-          model: OpenAIResponses.model({ ...model, queryParams: { "api-version": "v1" } }),
+          model: Model.update(model, { route: model.route.with({ endpoint: { query: { "api-version": "v1" } } }) }),
        }),
      ).pipe(
        Effect.provide(
@@ -177,17 +152,18 @@ describe("OpenAI Responses route", () => {
    Effect.gen(function* () {
      yield* LLMClient.generate(
        LLM.updateRequest(request, {
-          model: Azure.responses("gpt-4.1-mini", {
+          model: Azure.configure({
            baseURL: "https://opencode-test.openai.azure.com/openai/v1/",
            apiKey: "azure-key",
            headers: { authorization: "Bearer stale" },
-          }),
+          }).responses("gpt-4.1-mini"),
        }),
      ).pipe(
        Effect.provide(
          dynamicResponse((input) =>
            Effect.gen(function* () {
              const web = yield* HttpClientRequest.toWeb(input.request).pipe(Effect.orDie)
+              expect(web.url).toBe("https://opencode-test.openai.azure.com/openai/v1/responses?api-version=v1")
              expect(web.headers.get("api-key")).toBe("azure-key")
              expect(web.headers.get("authorization")).toBeNull()
              return input.respond(sseEvents({ type: "response.completed", response: {} }), {
@@ -203,7 +179,7 @@ describe("OpenAI Responses route", () => {
  it.effect("loads OpenAI default auth from Effect Config", () =>
    LLMClient.generate(
      LLM.updateRequest(request, {
-        model: OpenAI.responses("gpt-4.1-mini", { baseURL: "https://api.openai.test/v1/" }),
+        model: OpenAI.configure({ baseURL: "https://api.openai.test/v1/" }).responses("gpt-4.1-mini"),
      }),
    ).pipe(
      configEnv({ OPENAI_API_KEY: "env-key" }),
@@ -224,10 +200,10 @@ describe("OpenAI Responses route", () => {
  it.effect("lets explicit auth override OpenAI default API key auth", () =>
    LLMClient.generate(
      LLM.updateRequest(request, {
-        model: OpenAI.responses("gpt-4.1-mini", {
+        model: OpenAI.configure({
          baseURL: "https://api.openai.test/v1/",
          auth: Auth.bearer("oauth-token"),
-        }),
+        }).responses("gpt-4.1-mini"),
      }),
    ).pipe(
      Effect.provide(
@@ -274,7 +250,7 @@ describe("OpenAI Responses route", () => {
    Effect.gen(function* () {
      const prepared = yield* LLMClient.prepare<OpenAIResponses.OpenAIResponsesBody>(
        LLM.request({
-          model: OpenAI.model("gpt-5.2", { baseURL: "https://api.openai.test/v1/" }),
+          model: OpenAI.configure({ baseURL: "https://api.openai.test/v1/", apiKey: "test" }).model("gpt-5.2"),
          prompt: "think",
          providerOptions: {
            openai: {
@@ -295,14 +271,15 @@ describe("OpenAI Responses route", () => {
    }),
  )

-  it.effect("request OpenAI provider options override model defaults", () =>
+  it.effect("request OpenAI provider options override route defaults", () =>
    Effect.gen(function* () {
      const prepared = yield* LLMClient.prepare<OpenAIResponses.OpenAIResponsesBody>(
        LLM.request({
-          model: OpenAI.model("gpt-4.1-mini", {
+          model: OpenAI.configure({
            baseURL: "https://api.openai.test/v1/",
+            apiKey: "test",
            providerOptions: { openai: { promptCacheKey: "model_cache" } },
-          }),
+          }).model("gpt-4.1-mini"),
          prompt: "no cache",
          providerOptions: { openai: { promptCacheKey: "request_cache" } },
        }),
@@ -532,17 +509,36 @@ describe("OpenAI Responses route", () => {
    }),
  )

+  it.effect("lowers user image content", () =>
+    Effect.gen(function* () {
+      const prepared = yield* LLMClient.prepare<OpenAIResponses.OpenAIResponsesBody>(
+        LLM.request({
+          id: "req_media",
+          model,
+          messages: [Message.user({ type: "media", mediaType: "image/png", data: "AAECAw==" })],
+        }),
+      )
+
+      expect(prepared.body.input).toEqual([
+        {
+          role: "user",
+          content: [{ type: "input_image", image_url: "data:image/png;base64,AAECAw==" }],
+        },
+      ])
+    }),
+  )
+
  it.effect("rejects unsupported user media content", () =>
    Effect.gen(function* () {
      const error = yield* LLMClient.prepare(
        LLM.request({
          id: "req_media",
          model,
-          messages: [Message.user({ type: "media", mediaType: "image/png", data: "AAECAw==" })],
+          messages: [Message.user({ type: "media", mediaType: "application/pdf", data: "AAECAw==" })],
        }),
      ).pipe(Effect.flip)

-      expect(error.message).toContain("OpenAI Responses user messages only support text content for now")
+      expect(error.message).toContain("OpenAI Responses user media content only supports images")
    }),
  )

--- a/packages/llm/test/provider/openrouter.test.ts
+++ b/packages/llm/test/provider/openrouter.test.ts
@@ -8,15 +8,14 @@ import { it } from "../lib/effect"
 describe("OpenRouter", () => {
  it.effect("prepares OpenRouter models through the OpenAI-compatible Chat route", () =>
    Effect.gen(function* () {
-      const model = OpenRouter.model("openai/gpt-4o-mini", { apiKey: "test-key" })
+      const model = OpenRouter.configure({ apiKey: "test-key" }).model("openai/gpt-4o-mini")

      expect(model).toMatchObject({
        id: "openai/gpt-4o-mini",
        provider: "openrouter",
-        route: "openrouter",
-        baseURL: "https://openrouter.ai/api/v1",
-        apiKey: "test-key",
+        route: { id: "openrouter" },
      })
+      expect(model.route.endpoint.baseURL).toBe("https://openrouter.ai/api/v1")

      const prepared = yield* LLMClient.prepare(LLM.request({ model, prompt: "Say hello." }))

@@ -33,7 +32,8 @@ describe("OpenRouter", () => {
    Effect.gen(function* () {
      const prepared = yield* LLMClient.prepare(
        LLM.request({
-          model: OpenRouter.model("anthropic/claude-3.7-sonnet:thinking", {
+          model: OpenRouter.configure({
+            apiKey: "test-key",
            providerOptions: {
              openrouter: {
                usage: true,
@@ -41,7 +41,7 @@ describe("OpenRouter", () => {
                promptCacheKey: "session_123",
              },
            },
-          }),
+          }).model("anthropic/claude-3.7-sonnet:thinking"),
          prompt: "Think briefly.",
        }),
      )
--- a/packages/llm/test/recorded-golden.ts
+++ b/packages/llm/test/recorded-golden.ts
@@ -1,7 +1,7 @@
 import type { HttpRecorder } from "@opencode-ai/http-recorder"
 import { describe, type TestOptions } from "bun:test"
 import { Effect } from "effect"
-import type { ModelRef } from "../src"
+import type { Model } from "../src"
 import { goldenScenarioTags, runGoldenScenario, type GoldenScenarioID } from "./recorded-scenarios"
 import { recordedTests } from "./recorded-test"
 import { kebab } from "./recorded-utils"
@@ -22,7 +22,7 @@ type ScenarioInput =

 type TargetInput = {
  readonly name: string
-  readonly model: ModelRef
+  readonly model: Model
  readonly protocol?: string
  readonly requires?: ReadonlyArray<string>
  readonly transport?: Transport
@@ -38,19 +38,20 @@ const scenarioInput = (input: ScenarioInput) => (typeof input === "string" ? { i
 const scenarioTitle = (id: GoldenScenarioID) => {
  if (id === "text") return "streams text"
  if (id === "tool-call") return "streams tool call"
+  if (id === "image") return "reads image text"
  return "drives a tool loop"
 }

 const defaultPrefix = (target: TargetInput) => {
  if (target.prefix) return target.prefix
  const transport = target.transport === "websocket" ? "-websocket" : ""
-  return `${target.model.provider}-${target.protocol ?? target.model.route}${transport}`
+  return `${target.model.provider}-${target.protocol ?? target.model.route.id}${transport}`
 }

 const metadata = (target: TargetInput) => ({
  provider: target.model.provider,
  protocol: target.protocol,
-  route: target.model.route,
+  route: target.model.route.id,
  transport: target.transport ?? "http",
  model: target.model.id,
  ...target.metadata,
--- a/packages/llm/test/recorded-scenarios.ts
+++ b/packages/llm/test/recorded-scenarios.ts
@@ -1,6 +1,6 @@
 import { expect } from "bun:test"
 import { Effect, Schema, Stream } from "effect"
-import { LLM, LLMEvent, LLMResponse, ToolChoice, ToolDefinition, type LLMRequest, type ModelRef } from "../src"
+import { LLM, LLMEvent, LLMResponse, Message, ToolChoice, ToolDefinition, type LLMRequest, type Model } from "../src"
 import { LLMClient } from "../src/route"
 import { tool } from "../src/tool"

@@ -41,7 +41,7 @@ export const weatherRuntimeTool = tool({

 export const textRequest = (input: {
  readonly id: string
-  readonly model: ModelRef
+  readonly model: Model
  readonly prompt?: string
  readonly maxTokens?: number
  readonly temperature?: number | false
@@ -52,15 +52,17 @@ export const textRequest = (input: {
    system: "You are concise.",
    prompt: input.prompt ?? "Reply with exactly: Hello!",
    cache: "none",
+    providerOptions:
+      input.model.route.id === "gemini" ? { gemini: { thinkingConfig: { thinkingBudget: 0 } } } : undefined,
    generation:
      input.temperature === false
-        ? { maxTokens: input.maxTokens ?? 20 }
-        : { maxTokens: input.maxTokens ?? 20, temperature: input.temperature ?? 0 },
+        ? { maxTokens: input.maxTokens ?? 80 }
+        : { maxTokens: input.maxTokens ?? 80, temperature: input.temperature ?? 0 },
  })

 export const weatherToolRequest = (input: {
  readonly id: string
-  readonly model: ModelRef
+  readonly model: Model
  readonly maxTokens?: number
  readonly temperature?: number | false
 }) =>
@@ -80,7 +82,7 @@ export const weatherToolRequest = (input: {

 export const weatherToolLoopRequest = (input: {
  readonly id: string
-  readonly model: ModelRef
+  readonly model: Model
  readonly system?: string
  readonly maxTokens?: number
  readonly temperature?: number | false
@@ -99,7 +101,7 @@ export const weatherToolLoopRequest = (input: {

 export const goldenWeatherToolLoopRequest = (input: {
  readonly id: string
-  readonly model: ModelRef
+  readonly model: Model
  readonly maxTokens?: number
  readonly temperature?: number | false
 }) =>
@@ -108,6 +110,39 @@ export const goldenWeatherToolLoopRequest = (input: {
    system: "Use the get_weather tool exactly once. After the tool result, reply exactly: Paris is sunny.",
  })

+const RESTROOM_IMAGE_TEXT = "jiggling restroom prison"
+const restroomImage = () =>
+  Effect.promise(() => Bun.file(new URL("./fixtures/media/restroom.png", import.meta.url)).bytes()).pipe(
+    Effect.map((bytes) => Buffer.from(bytes).toString("base64")),
+  )
+
+export const imageRequest = (input: {
+  readonly id: string
+  readonly model: Model
+  readonly image: string
+  readonly maxTokens?: number
+  readonly temperature?: number | false
+}) =>
+  LLM.request({
+    id: input.id,
+    model: input.model,
+    system: "Read images carefully. Reply only with the visible text.",
+    messages: [
+      Message.user([
+        {
+          type: "text",
+          text: "The image contains exactly three lowercase English words. Read them left to right and reply with only those words.",
+        },
+        { type: "media", mediaType: "image/png", data: input.image },
+      ]),
+    ],
+    cache: "none",
+    generation:
+      input.temperature === false
+        ? { maxTokens: input.maxTokens ?? 20 }
+        : { maxTokens: input.maxTokens ?? 20, temperature: input.temperature ?? 0 },
+  })
+
 export const runWeatherToolLoop = (request: LLMRequest) =>
  LLMClient.stream({
    request,
@@ -158,20 +193,28 @@ export const expectGoldenWeatherToolLoop = (events: ReadonlyArray<LLMEvent>) =>
  expect(LLMResponse.text({ events }).trim()).toMatch(/^Paris is sunny\.?$/)
 }

-export type GoldenScenarioID = "text" | "tool-call" | "tool-loop"
+export type GoldenScenarioID = "text" | "tool-call" | "tool-loop" | "image"

 export interface GoldenScenarioContext {
  readonly id: string
-  readonly model: ModelRef
+  readonly model: Model
  readonly maxTokens?: number
  readonly temperature?: number | false
 }

 const generate = (request: LLMRequest) => LLMClient.generate(request)

+const normalizeImageText = (value: string) =>
+  value
+    .toLowerCase()
+    .replace(/[^a-z\s]/g, "")
+    .replace(/\s+/g, " ")
+    .trim()
+
 export const goldenScenarioTags = (id: GoldenScenarioID) => {
  if (id === "text") return ["text", "golden"]
  if (id === "tool-call") return ["tool", "tool-call", "golden"]
+  if (id === "image") return ["media", "image", "vision", "golden"]
  return ["tool", "tool-loop", "golden"]
 }

@@ -206,6 +249,21 @@ export const runGoldenScenario = (id: GoldenScenarioID, context: GoldenScenarioC
      return
    }

+    if (id === "image") {
+      const response = yield* generate(
+        imageRequest({
+          id: context.id,
+          model: context.model,
+          image: yield* restroomImage(),
+          maxTokens: context.maxTokens ?? 20,
+          temperature: context.temperature,
+        }),
+      )
+      expect(normalizeImageText(response.text)).toBe(RESTROOM_IMAGE_TEXT)
+      expectFinish(response.events, "stop")
+      return
+    }
+
    expectGoldenWeatherToolLoop(
      yield* runWeatherToolLoop(
        goldenWeatherToolLoopRequest({
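Reviewer note on the `image` golden scenario above: the assertion compares normalized text, so harmless differences in casing, punctuation, or whitespace in a model reply do not fail the test. The sketch below copies `normalizeImageText` from the hunk and runs it on a few sample replies; the sample strings are illustrative assumptions, not recorded model output.

```typescript
// Copied from the diff: lowercase, drop everything except a-z and whitespace,
// collapse whitespace runs to a single space, then trim.
const normalizeImageText = (value: string): string =>
  value
    .toLowerCase()
    .replace(/[^a-z\s]/g, "")
    .replace(/\s+/g, " ")
    .trim()

// Hypothetical replies a vision model might produce for the same image.
const samples = [
  "Jiggling Restroom Prison.",
  "  jiggling\nrestroom   prison ",
  '"jiggling restroom prison"',
]

const normalized = samples.map(normalizeImageText)
console.log(normalized)
```

Because the character class keeps only `a-z` and whitespace, digits and non-ASCII letters are stripped as well. That suits this fixture, whose expected text is three lowercase English words, but it would need revisiting for fixtures that contain numbers.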
--- a/packages/llm/test/recorded-test.ts
+++ b/packages/llm/test/recorded-test.ts
@@ -69,8 +69,6 @@ export const recordedTests = (options: RecordedTestsOptions) =>
        requestExecutor,
        webSocketCassetteLayer(cassette, { metadata: recorderMetadata, mode }),
      )
-      return Layer.mergeAll(deps, LLMClient.layerWithWebSocket.pipe(Layer.provide(deps))).pipe(
-        Layer.provide(cassetteService),
-      )
+      return Layer.mergeAll(deps, LLMClient.layer.pipe(Layer.provide(deps))).pipe(Layer.provide(cassetteService))
    },
  })
--- a/packages/llm/test/route.test.ts
+++ b/packages/llm/test/route.test.ts
@@ -0,0 +1,43 @@
+import { describe, expect, test } from "bun:test"
+import * as OpenAIChat from "../src/protocols/openai-chat"
+import { Auth } from "../src/route"
+
+describe("Route.with", () => {
+  test("merges endpoint query and header defaults while replacing auth and id", () => {
+    const auth = Auth.headers({ "x-auth": "new" })
+    const route = OpenAIChat.route
+      .with({
+        id: "base-chat",
+        endpoint: {
+          baseURL: "https://api.example.test/v1",
+          query: { keep: "base", base: "1" },
+        },
+        headers: { "x-base": "base", "x-override": "base" },
+        auth: Auth.headers({ "x-auth": "old" }),
+      })
+      .with({
+        id: "patched-chat",
+        endpoint: { query: { keep: "patch", patch: "1" } },
+        headers: { "x-override": "patch", "x-patch": "patch" },
+        auth,
+      })
+
+    expect(route.id).toBe("patched-chat")
+    expect(route.auth).toBe(auth)
+    expect(route.endpoint).toMatchObject({
+      baseURL: "https://api.example.test/v1",
+      path: "/chat/completions",
+      query: { keep: "patch", base: "1", patch: "1" },
+    })
+    expect(route.defaults.headers).toEqual({
+      "x-base": "base",
+      "x-override": "patch",
+      "x-patch": "patch",
+    })
+    expect(route.defaults.http?.headers).toEqual({
+      "x-base": "base",
+      "x-override": "patch",
+      "x-patch": "patch",
+    })
+  })
+})
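Reviewer note on the `Route.with` test above: it pins down that endpoint query parameters and headers merge key by key with the patch winning, while `id` and `auth` are replaced wholesale. The sketch below models only those semantics with simplified, hypothetical types (`RouteSketch`, `RoutePatch`, `withPatch`); it is not the package's actual `Route` implementation, which the diff does not show.

```typescript
// Hypothetical, simplified stand-ins for illustration only.
type Endpoint = { baseURL?: string; path?: string; query?: Record<string, string> }
type RouteSketch = {
  id: string
  endpoint: Endpoint
  headers: Record<string, string>
  auth?: unknown
}
type RoutePatch = Partial<Omit<RouteSketch, "endpoint">> & { endpoint?: Endpoint }

// Assumed merge rule, inferred from the test expectations:
// - id and auth: replaced when the patch provides them
// - endpoint fields: patch overrides, but query is merged per key
// - headers: merged per key, patch wins on conflicts
const withPatch = (route: RouteSketch, patch: RoutePatch): RouteSketch => ({
  id: patch.id ?? route.id,
  auth: patch.auth ?? route.auth,
  endpoint: {
    ...route.endpoint,
    ...patch.endpoint,
    query: { ...route.endpoint.query, ...patch.endpoint?.query },
  },
  headers: { ...route.headers, ...patch.headers },
})

const base: RouteSketch = {
  id: "base-chat",
  endpoint: {
    baseURL: "https://api.example.test/v1",
    path: "/chat/completions",
    query: { keep: "base", base: "1" },
  },
  headers: { "x-base": "base", "x-override": "base" },
  auth: "old",
}

const patched = withPatch(base, {
  id: "patched-chat",
  endpoint: { query: { keep: "patch", patch: "1" } },
  headers: { "x-override": "patch", "x-patch": "patch" },
  auth: "new",
})

console.log(patched.endpoint.query, patched.headers)
```

Merging query and headers per key lets a provider-level route set defaults such as `api-version` that a call site can extend without restating them. Replacing `auth` outright avoids mixing credentials from two sources.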
--- a/packages/llm/test/schema.test.ts
+++ b/packages/llm/test/schema.test.ts
@@ -1,16 +1,19 @@
 import { describe, expect, test } from "bun:test"
 import { Schema } from "effect"
-import { ContentPart, LLMEvent, LLMRequest, ModelID, ModelLimits, ModelRef, ProviderID, Usage } from "../src/schema"
+import * as OpenAIChat from "../src/protocols/openai-chat"
+import * as OpenAIResponses from "../src/protocols/openai-responses"
+import { ContentPart, LLMEvent, LLMRequest, Model, ModelID, ProviderID, Usage } from "../src/schema"
 import { ProviderShared } from "../src/protocols/shared"

-const model = new ModelRef({
+const model = new Model({
  id: ModelID.make("fake-model"),
  provider: ProviderID.make("fake-provider"),
-  route: "openai-chat",
-  baseURL: "https://fake.local",
-  limits: new ModelLimits({}),
+  route: OpenAIChat.route,
 })

+const decodeLLMRequest = Schema.decodeUnknownSync(LLMRequest as unknown as Schema.Decoder<LLMRequest>)
+const decodeLLMEvent = Schema.decodeUnknownSync(LLMEvent as unknown as Schema.Decoder<LLMEvent>)
+
 describe("llm schema", () => {
  test("decodes a minimal request", () => {
    const input: unknown = {
@@ -22,26 +25,26 @@ describe("llm schema", () => {
      generation: {},
    }

-    const decoded = Schema.decodeUnknownSync(LLMRequest)(input)
+    const decoded = decodeLLMRequest(input)

    expect(decoded.id).toBe("req_1")
    expect(decoded.messages[0]?.content[0]?.type).toBe("text")
  })

  test("accepts custom route ids", () => {
-    const decoded = Schema.decodeUnknownSync(LLMRequest)({
-      model: { ...model, route: "custom-route" },
+    const decoded = decodeLLMRequest({
+      model: Model.update(model, { route: OpenAIResponses.route }),
      system: [],
      messages: [],
      tools: [],
      generation: {},
    })

-    expect(decoded.model.route).toBe("custom-route")
+    expect(decoded.model.route.id).toBe("openai-responses")
  })

  test("rejects invalid event type", () => {
-    expect(() => Schema.decodeUnknownSync(LLMEvent)({ type: "bogus" })).toThrow()
+    expect(() => decodeLLMEvent({ type: "bogus" })).toThrow()
  })

  test("finish constructors accept usage input", () => {
--- a/packages/llm/test/tool-runtime.test.ts
+++ b/packages/llm/test/tool-runtime.test.ts
@@ -1,7 +1,7 @@
 import { describe, expect } from "bun:test"
 import { Effect, Schema, Stream } from "effect"
 import { GenerationOptions, LLM, LLMEvent, LLMRequest, LLMResponse, ToolChoice } from "../src"
-import { LLMClient } from "../src/route"
+import { Auth, LLMClient } from "../src/route"
 import * as AnthropicMessages from "../src/protocols/anthropic-messages"
 import * as OpenAIChat from "../src/protocols/openai-chat"
 import { tool, ToolFailure, type ToolExecuteContext } from "../src/tool"
@@ -12,11 +12,9 @@ import { dynamicResponse, scriptedResponses } from "./lib/http"
 import { deltaChunk, finishChunk, toolCallChunk } from "./lib/openai-chunks"
 import { sseEvents } from "./lib/sse"

-const model = OpenAIChat.model({
-  id: "gpt-4o-mini",
-  baseURL: "https://api.openai.test/v1/",
-  headers: { authorization: "Bearer test" },
-})
+const model = OpenAIChat.route
+  .with({ endpoint: { baseURL: "https://api.openai.test/v1/" }, auth: Auth.bearer("test") })
+  .model({ id: "gpt-4o-mini" })
 const Json = Schema.fromJsonString(Schema.Unknown)
 const decodeJson = Schema.decodeUnknownSync(Json)

@@ -141,6 +139,45 @@ describe("LLMClient tools", () => {
    }),
  )

+  it.effect("preserves content tool results from dynamic tools", () =>
+    Effect.gen(function* () {
+      const screenshot = tool({
+        description: "Capture a screenshot.",
+        jsonSchema: { type: "object", properties: {} },
+        execute: () =>
+          Effect.succeed({
+            type: "content" as const,
+            value: [
+              { type: "text" as const, text: "Screenshot captured." },
+              { type: "media" as const, mediaType: "image/png", data: "AAAA" },
+            ],
+          }),
+      })
+
+      const events = Array.from(
+        yield* LLMClient.stream({ request: baseRequest, tools: { screenshot } }).pipe(
+          Stream.runCollect,
+          Effect.provide(
+            scriptedResponses([sseEvents(toolCallChunk("call_1", "screenshot", "{}"), finishChunk("tool_calls"))]),
+          ),
+        ),
+      )
+
+      expect(events.find(LLMEvent.is.toolResult)).toMatchObject({
+        type: "tool-result",
+        id: "call_1",
+        name: "screenshot",
+        result: {
+          type: "content",
+          value: [
+            { type: "text", text: "Screenshot captured." },
+            { type: "media", mediaType: "image/png", data: "AAAA" },
+          ],
+        },
+      })
+    }),
+  )
+
  it.effect("executes tool calls for one step without looping by default", () =>
    Effect.gen(function* () {
      const layer = scriptedResponses([
@@ -249,7 +286,9 @@ describe("LLMClient tools", () => {

      yield* TestToolRuntime.runTools({
        request: LLM.updateRequest(baseRequest, {
-          model: AnthropicMessages.model({ id: "claude-sonnet-4-5", apiKey: "test" }),
+          model: AnthropicMessages.route
+            .with({ auth: Auth.header("x-api-key", "test") })
+            .model({ id: "claude-sonnet-4-5" }),
        }),
        tools: { get_weather },
      }).pipe(Stream.runCollect, Effect.provide(layer))
@@ -496,7 +535,9 @@ describe("LLMClient tools", () => {
      const events = Array.from(
        yield* TestToolRuntime.runTools({
          request: LLM.updateRequest(baseRequest, {
-            model: AnthropicMessages.model({ id: "claude-sonnet-4-5", apiKey: "test" }),
+            model: AnthropicMessages.route
+              .with({ auth: Auth.header("x-api-key", "test") })
+              .model({ id: "claude-sonnet-4-5" }),
          }),
          tools: {},
        }).pipe(Stream.runCollect, Effect.provide(layer)),
--- a/packages/llm/test/tool.types.ts
+++ b/packages/llm/test/tool.types.ts
@@ -1,10 +1,11 @@
 import { Effect, Schema } from "effect"
 import { LLM } from "../src"
 import * as OpenAIChat from "../src/protocols/openai-chat"
+import { Auth } from "../src/route"
 import { tool } from "../src/tool"

 const request = LLM.request({
-  model: OpenAIChat.model({ id: "gpt-4o-mini", apiKey: "fixture" }),
+  model: OpenAIChat.route.with({ auth: Auth.bearer("fixture") }).model({ id: "gpt-4o-mini" }),
  prompt: "Use the tool.",
 })

--- a/packages/opencode/src/session/llm.ts
+++ b/packages/opencode/src/session/llm.ts
@@ -4,7 +4,7 @@ import { Context, Effect, Layer, Record } from "effect"
 import * as Stream from "effect/Stream"
 import { streamText, wrapLanguageModel, type ModelMessage, type Tool, tool as aiTool, jsonSchema } from "ai"
 import type { LLMEvent } from "@opencode-ai/llm"
-import { LLMClient, RequestExecutor } from "@opencode-ai/llm/route"
+import { LLMClient, RequestExecutor, WebSocketExecutor } from "@opencode-ai/llm/route"
 import type { LLMClientService } from "@opencode-ai/llm/route"
 import { mergeDeep } from "remeda"
 import { GitLabWorkflowLanguageModel } from "gitlab-ai-provider"
@@ -349,6 +349,8 @@ const live: Layer.Layer<
        ...headers,
      }

+      // Runtime seam: native is an opt-in adapter over @opencode-ai/llm. It
+      // either returns a ready LLMEvent stream or a concrete fallback reason.
      if (flags.experimentalNativeLlm) {
        const native = LLMNativeRuntime.stream({
          model: input.model,
@@ -399,6 +401,8 @@ const live: Layer.Layer<
          "llm.model": input.model.id,
        }),
      )
+      // Default runtime path: AI SDK owns provider execution and tool dispatch;
+      // LLMAISDK.toLLMEvents below normalizes fullStream parts for the processor.
      return {
        type: "ai-sdk" as const,
        result: streamText({
@@ -481,6 +485,8 @@ const live: Layer.Layer<

            if (result.type === "native") return result.stream

+            // Adapter seam: both runtimes expose the same LLMEvent stream. Native
+            // already returns one; AI SDK streams are converted here.
            const state = LLMAISDK.adapterState()
            return Stream.fromAsyncIterable(result.result.fullStream, (e) =>
              e instanceof Error ? e : new Error(String(e)),
@@ -504,7 +510,9 @@ export const defaultLayer = Layer.suspend(() =>
    Layer.provide(Config.defaultLayer),
    Layer.provide(Provider.defaultLayer),
    Layer.provide(Plugin.defaultLayer),
-    Layer.provide(LLMClient.layer.pipe(Layer.provide(RequestExecutor.defaultLayer))),
+    Layer.provide(
+      LLMClient.layer.pipe(Layer.provide(Layer.mergeAll(RequestExecutor.defaultLayer, WebSocketExecutor.layer))),
+    ),
    Layer.provide(RuntimeFlags.defaultLayer),
  ),
 )
--- a/packages/opencode/src/session/llm/AGENTS.md
+++ b/packages/opencode/src/session/llm/AGENTS.md
@@ -1,6 +1,6 @@
 # Session LLM Runtime Boundaries

-`../llm.ts` is the opencode session LLM service. It owns opencode concerns: auth, config, model/provider resolution, plugins, permissions, telemetry headers, and runtime selection.
+`../llm.ts` is the opencode session LLM service. It owns opencode concerns: auth, config, model/provider resolution, plugins, permissions, telemetry headers, and runtime selection. It is the only file in this area that should know about the full session request shape.

 This folder contains adapters behind that service boundary:

@@ -8,6 +8,29 @@ This folder contains adapters behind that service boundary:
 - `native-request.ts` converts opencode's normalized session input into a native `@opencode-ai/llm` `LLMRequest`. It does not execute requests.
 - `native-runtime.ts` is the opt-in native runtime adapter. It decides whether a selected model is supported, builds the native request, bridges opencode tools into native executable tools, and delegates transport to `LLMClient` / `RequestExecutor`.

+## File Structure
+
+```txt
+src/session/
+  llm.ts                    session-owned orchestration and runtime selection
+  llm/
+    AGENTS.md               boundary notes for the adapter layer
+    ai-sdk.ts               AI SDK fullStream -> @opencode-ai/llm LLMEvent adapter
+    native-request.ts       opencode/AI SDK-shaped input -> @opencode-ai/llm LLMRequest
+    native-runtime.ts       native runtime gate, tool bridge, and LLMClient handoff
+```
+
+Integration points:
+
+- `../llm.ts` imports `LLMClient` from `@opencode-ai/llm/route`; native execution is the only path that calls it directly.
+- `../llm.ts` imports `LLMAISDK` from `./llm/ai-sdk`; the AI SDK path still calls `streamText(...)` locally, then adapts `result.fullStream` into shared `LLMEvent`s.
+- `../llm.ts` imports `LLMNativeRuntime` from `./llm/native-runtime`; this is the runtime-selection seam. Unsupported native requests return a reason and fall back to AI SDK.
+- `native-runtime.ts` imports `LLMNative` from `./native-request`; this keeps request lowering separate from transport and tool execution.
+- `native-request.ts` is the only adapter file that should construct `LLM.request(...)`, `LLM.model(...)`, `Message.*`, `SystemPart`, `ToolCallPart`, `ToolResultPart`, or `ToolDefinition` values from `@opencode-ai/llm`.
+- `ai-sdk.ts` and `native-runtime.ts` both emit `@opencode-ai/llm` `LLMEvent`s so downstream session processing does not care which runtime handled the request.
+
+Keep new integration code on one of these seams. Avoid importing session services into `native-request.ts`; pass normalized data through `RequestInput` instead.
+
 ## Runtime selection

 Both runtimes converge on the same `LLMEvent` stream consumed by the session processor. The gate is per-request: a single session can route some calls through native and fall back for others.
@@ -63,5 +86,5 @@ Safety boundary:

 - AI SDK remains the default.
 - `OPENCODE_EXPERIMENTAL_NATIVE_LLM=true` or the umbrella `OPENCODE_EXPERIMENTAL=true` opts in. Native is not a global replacement.
-- Native execution currently runs only for OpenAI-compatible Responses models exposed through `@ai-sdk/openai`: direct `openai` API-key auth and console-managed `opencode`/Zen API-key config.
+- Native execution currently supports OpenAI, opencode-managed OpenAI-compatible, and Anthropic API-key paths backed by `@ai-sdk/openai`, `@ai-sdk/openai-compatible`, or `@ai-sdk/anthropic` catalog entries.
 - Unsupported providers, OpenAI OAuth, and missing API-key cases fall back to AI SDK.
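
The safety boundary above amounts to a per-request gate that returns either a usable API key or a fallback reason. A minimal sketch follows, assuming simplified input shapes: `GateInput`, `gate`, and `selectRuntime` are hypothetical names, not the API of `native-runtime.ts`, and the missing-key reason string is an assumption.

```typescript
// Hypothetical, flattened input; the real StreamInput in native-runtime.ts
// carries full model, provider, and auth objects.
type GateInput = {
  providerID: string
  npm: string
  authType?: "oauth" | "api"
  apiKey?: string
}

type GateResult =
  | { type: "supported"; apiKey: string }
  | { type: "unsupported"; reason: string }

// Provider packages the native path accepts after this change.
const NATIVE_PACKAGES = ["@ai-sdk/openai", "@ai-sdk/openai-compatible", "@ai-sdk/anthropic"]

function gate(input: GateInput): GateResult {
  const { providerID, npm } = input
  if (providerID !== "openai" && providerID !== "anthropic" && !providerID.startsWith("opencode"))
    return { type: "unsupported", reason: "provider is not openai, opencode, or anthropic" }
  if (!NATIVE_PACKAGES.includes(npm))
    return { type: "unsupported", reason: "provider package is not OpenAI, OpenAI-compatible, or Anthropic" }
  if (input.authType === "oauth") return { type: "unsupported", reason: "OAuth auth is not supported" }
  // Assumed wording: the diff does not show the reason used for a missing key.
  if (!input.apiKey) return { type: "unsupported", reason: "API key is missing" }
  return { type: "supported", apiKey: input.apiKey }
}

// AI SDK stays the default; native runs only when the flag is on and the gate passes.
function selectRuntime(input: GateInput, nativeEnabled: boolean): "native" | "ai-sdk" {
  if (!nativeEnabled) return "ai-sdk"
  return gate(input).type === "supported" ? "native" : "ai-sdk"
}
```

Because the gate is evaluated per request, one session can send some calls through native and fall back to AI SDK for others.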
--- a/packages/opencode/src/session/llm/native-request.ts
+++ b/packages/opencode/src/session/llm/native-request.ts
@@ -1,6 +1,14 @@
 import type { JsonSchema, LLMRequest, ProviderMetadata } from "@opencode-ai/llm"
 import { LLM, Message, SystemPart, ToolCallPart, ToolDefinition, ToolResultPart } from "@opencode-ai/llm"
-import "@opencode-ai/llm/providers"
+import {
+  AmazonBedrock,
+  Anthropic,
+  Azure,
+  Google,
+  OpenAI,
+  OpenAICompatible,
+  OpenRouter,
+} from "@opencode-ai/llm/providers"
 import type { ModelMessage } from "ai"
 import type { Provider } from "@/provider/provider"
 import { isRecord } from "@/util/record"
@@ -26,24 +34,6 @@ export type RequestInput = {
  readonly headers?: Record<string, string>
 }

-const DEFAULT_BASE_URL: Record<string, string> = {
-  "@ai-sdk/openai": "https://api.openai.com/v1",
-  "@ai-sdk/anthropic": "https://api.anthropic.com/v1",
-  "@ai-sdk/google": "https://generativelanguage.googleapis.com/v1beta",
-  "@ai-sdk/amazon-bedrock": "https://bedrock-runtime.us-east-1.amazonaws.com",
-  "@openrouter/ai-sdk-provider": "https://openrouter.ai/api/v1",
-}
-
-const ROUTE: Record<string, string> = {
-  "@ai-sdk/openai": "openai-responses",
-  "@ai-sdk/azure": "azure-openai-responses",
-  "@ai-sdk/anthropic": "anthropic-messages",
-  "@ai-sdk/google": "gemini",
-  "@ai-sdk/amazon-bedrock": "bedrock-converse",
-  "@ai-sdk/openai-compatible": "openai-compatible-chat",
-  "@openrouter/ai-sdk-provider": "openrouter",
-}
-
 const providerMetadata = (value: unknown): ProviderMetadata | undefined => {
  if (!isRecord(value)) return undefined
  const result = Object.fromEntries(
@@ -147,33 +137,46 @@ const generation = (input: RequestInput) => {
  return Object.values(result).some((value) => value !== undefined) ? result : undefined
 }

-const baseURL = (model: Provider.Model) => {
-  if (model.api.url) return model.api.url
-  const fallback = DEFAULT_BASE_URL[model.api.npm]
-  if (fallback) return fallback
+const baseURL = (input: Provider.Model | RequestInput) =>
+  "model" in input ? (input.baseURL ?? (input.model.api.url || undefined)) : input.api.url || undefined
+
+const requireBaseURL = (model: Provider.Model, url: string | undefined) => {
+  if (url) return url
  throw new Error(`Native LLM request adapter requires a base URL for ${model.providerID}/${model.id}`)
 }

 export const model = (input: Provider.Model | RequestInput, headers?: Record<string, string>) => {
  const model = "model" in input ? input.model : input
-  const route = ROUTE[model.api.npm]
-  if (!route) throw new Error(`Native LLM request adapter does not support provider package ${model.api.npm}`)
-  return LLM.model({
-    id: model.api.id,
-    provider: model.providerID,
-    route,
-    baseURL: "model" in input && input.baseURL ? input.baseURL : baseURL(model),
-    apiKey: "model" in input ? input.apiKey : undefined,
+  const url = baseURL(input)
+  const options = {
+    ...("model" in input && input.apiKey ? { apiKey: input.apiKey } : {}),
+    ...(url ? { baseURL: url } : {}),
    headers: Object.keys({ ...model.headers, ...headers }).length === 0 ? undefined : { ...model.headers, ...headers },
    limits: {
      context: model.limit.context,
      output: model.limit.output,
    },
-  })
+  }
+  if (model.api.npm === "@ai-sdk/openai") return OpenAI.configure(options).responses(model.api.id)
+  if (model.api.npm === "@ai-sdk/azure")
+    return Azure.configure({ ...options, baseURL: requireBaseURL(model, url) }).responses(model.api.id)
+  if (model.api.npm === "@ai-sdk/anthropic") return Anthropic.configure(options).model(model.api.id)
+  if (model.api.npm === "@ai-sdk/google") return Google.configure(options).model(model.api.id)
+  if (model.api.npm === "@ai-sdk/amazon-bedrock") return AmazonBedrock.configure(options).model(model.api.id)
+  if (model.api.npm === "@ai-sdk/openai-compatible")
+    return OpenAICompatible.configure({
+      ...options,
+      provider: String(model.providerID),
+      baseURL: requireBaseURL(model, url),
+    }).model(model.api.id)
+  if (model.api.npm === "@openrouter/ai-sdk-provider") return OpenRouter.configure(options).model(model.api.id)
+  throw new Error(`Native LLM request adapter does not support provider package ${model.api.npm}`)
 }

 export const request = (input: RequestInput) => {
  const converted = messages(input.messages)
+  // This is the only native adapter boundary that should construct canonical
+  // @opencode-ai/llm request objects from opencode's session/AI SDK-shaped data.
  return LLM.request({
    model: model(input, input.headers),
    system: [...(input.system ?? []).map(SystemPart.make), ...converted.system],
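
The base URL handling in this hunk replaces the old `DEFAULT_BASE_URL` table with a precedence rule: a request-level override wins, then the model's configured URL, and an empty string counts as "not set". A minimal sketch of that rule, assuming trimmed-down shapes (`ModelInfo` and `RequestLike` are hypothetical stand-ins for `Provider.Model` and `RequestInput`):

```typescript
// Hypothetical minimal shapes; the real types carry many more fields.
type ModelInfo = { providerID: string; id: string; api: { url: string } }
type RequestLike = { model: ModelInfo; baseURL?: string }

// Request override first, then the model's URL; "" collapses to undefined.
const resolveBaseURL = (input: ModelInfo | RequestLike): string | undefined =>
  "model" in input ? (input.baseURL ?? (input.model.api.url || undefined)) : input.api.url || undefined

// Used only for providers with no usable default endpoint.
const requireBaseURL = (model: ModelInfo, url: string | undefined): string => {
  if (url) return url
  throw new Error(`Native LLM request adapter requires a base URL for ${model.providerID}/${model.id}`)
}

const bare: ModelInfo = { providerID: "openai", id: "gpt-5-mini", api: { url: "" } }
const hosted: ModelInfo = { providerID: "opencode", id: "gpt-5-mini", api: { url: "https://ai.example.test/v1" } }

const fromModel = resolveBaseURL(hosted)
const fromOverride = resolveBaseURL({ model: hosted, baseURL: "https://override.example.test/v1" })
const unset = resolveBaseURL({ model: bare })
```

When the result is `undefined`, the adapter leaves `baseURL` out of the options it passes to the provider. The updated tests in this change expect the OpenAI, Anthropic, Google, and OpenRouter routes to then report their own default endpoints, while the Azure and OpenAI-compatible branches call `requireBaseURL` and throw.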
--- a/packages/opencode/src/session/llm/native-runtime.ts
+++ b/packages/opencode/src/session/llm/native-runtime.ts
@@ -41,8 +41,8 @@ export function status(input: Pick<StreamInput, "model" | "provider" | "auth">):
  if (providerID !== "openai" && providerID !== "anthropic" && !providerID.startsWith("opencode"))
    return { type: "unsupported", reason: "provider is not openai, opencode, or anthropic" }
  const npm = input.model.api.npm
-  if (npm !== "@ai-sdk/openai" && npm !== "@ai-sdk/anthropic")
-    return { type: "unsupported", reason: "provider package is not OpenAI or Anthropic" }
+  if (npm !== "@ai-sdk/openai" && npm !== "@ai-sdk/openai-compatible" && npm !== "@ai-sdk/anthropic")
+    return { type: "unsupported", reason: "provider package is not OpenAI, OpenAI-compatible, or Anthropic" }
  if (input.auth?.type === "oauth") return { type: "unsupported", reason: "OAuth auth is not supported" }

  const apiKey = typeof input.provider.options.apiKey === "string" ? input.provider.options.apiKey : input.provider.key
@@ -59,6 +59,8 @@ export function stream(input: StreamInput): StreamResult {
  const current = status(input)
  if (current.type === "unsupported") return current

+  // Integration point with @opencode-ai/llm: native-request lowers session data
+  // into an LLMRequest, then LLMClient handles route selection and transport.
  return {
    ...current,
    stream: input.llmClient.stream({
@@ -99,6 +101,8 @@ export function nativeTools(tools: Record<string, Tool>, input: Pick<StreamInput
  return Object.fromEntries(
    Object.entries(tools).map(([name, item]) => [
      name,
+      // Tool execution remains opencode-owned. The native runtime only adapts
+      // the @opencode-ai/llm tool call back into the AI SDK Tool.execute shape.
      nativeTool({
        description: item.description ?? "",
        jsonSchema: nativeSchema(item.inputSchema),
--- a/packages/opencode/src/session/processor.ts
+++ b/packages/opencode/src/session/processor.ts
@@ -278,9 +278,11 @@ export const layer = Layer.effect(
        return { call: ctx.toolcalls[input.id], part }
      })

-      const isFilePart = Schema.is(MessageV2.FilePart)
+      const isFilePart = (value: unknown): value is MessageV2.FilePart => Schema.is(MessageV2.FilePart)(value)

-      const toolResultOutput = (value: Extract<StreamEvent, { type: "tool-result" }>) => {
+      const toolResultOutput = (
+        value: Extract<StreamEvent, { type: "tool-result" }>,
+      ): { title: string; metadata: Record<string, any>; output: string; attachments?: MessageV2.FilePart[] } => {
        if (isRecord(value.result.value) && typeof value.result.value.output === "string") {
          return {
            title: typeof value.result.value.title === "string" ? value.result.value.title : value.name,
--- a/packages/opencode/test/server/httpapi-event-diagnostics.test.ts
+++ b/packages/opencode/test/server/httpapi-event-diagnostics.test.ts
@@ -56,11 +56,11 @@ afterEach(async () => {
 })

 const inApp = <A, E>(eff: Effect.Effect<A, E, AppServices>) =>
-  Effect.flatMap(InstanceRef, (ctx) =>
-    ctx
-      ? Effect.promise(() => AppRuntime.runPromise(eff.pipe(Effect.provideService(InstanceRef, ctx))))
-      : Effect.die("InstanceRef not provided in test scope"),
-  )
+  Effect.gen(function* () {
+    const ctx = yield* InstanceRef
+    if (!ctx) return yield* Effect.die("InstanceRef not provided in test scope")
+    return yield* Effect.promise(() => AppRuntime.runPromise(eff.pipe(Effect.provideService(InstanceRef, ctx))))
+  })

 const publishConnected = inApp(Bus.Service.use((svc) => svc.publish(ServerEvent.Connected, {})))

@@ -112,7 +112,7 @@ const readNextEvent = (reader: ReadableStreamDefaultReader<Uint8Array>) =>
      if (result.done || !result.value) return Effect.fail(new Error("event stream closed"))
      const frames = decodeFrame(result.value)
      if (frames.length === 0) return Effect.fail(new Error("empty SSE frame"))
-      return Effect.succeed(frames[0]!)
+      return Effect.succeed(frames[0])
    }),
  )

@@ -186,8 +186,7 @@ describe("/event SSE delivery diagnostics", () => {

        const collected = yield* collectUntilEvent(reader, isPartUpdated)
        const updated = collected.find(isPartUpdated)
-        expect(updated).toBeDefined()
-        expect((updated as SseEvent).properties.part.id).toBe(partID)
+        expect(updated?.properties.part.id).toBe(partID)
      }),
    { git: true, config: { formatter: false, lsp: false } },
  )
@@ -217,7 +216,7 @@ describe("/event SSE delivery diagnostics", () => {
          }),
        )
        expect(event.type).toBe(MessageV2.Event.PartUpdated.type)
-        expect((event.properties as { part: { id: string } }).part.id).toBe(partID)
+        expect(event.properties).toMatchObject({ part: { id: partID } })
      }),
    { git: true, config: { formatter: false, lsp: false } },
  )
--- a/packages/opencode/test/session/llm-native-recorded.test.ts
+++ b/packages/opencode/test/session/llm-native-recorded.test.ts
@@ -13,7 +13,7 @@ import { Provider } from "@/provider/provider"
 import { ModelID, ProviderID } from "@/provider/schema"
 import { Filesystem } from "@/util/filesystem"
 import { LLMEvent, LLMResponse } from "@opencode-ai/llm"
-import { LLMClient, RequestExecutor } from "@opencode-ai/llm/route"
+import { LLMClient, RequestExecutor, WebSocketExecutor } from "@opencode-ai/llm/route"
 import { RuntimeFlags } from "@/effect/runtime-flags"
 import type { Agent } from "../../src/agent/agent"
 import { LLM } from "../../src/session/llm"
@@ -137,7 +137,7 @@ async function loadFixture(providerID: string, modelID: string) {
 function recordedNativeLLMLayer(spec: ProviderSpec) {
  // Only the HTTP client is recorded; RequestExecutor and the opencode LLM stack remain real.
  const recordedClient = LLMClient.layer.pipe(
-    Layer.provide(RequestExecutor.layer),
+    Layer.provide(Layer.mergeAll(RequestExecutor.layer, WebSocketExecutor.layer)),
    Layer.provide(
      HttpRecorder.recordingLayer(spec.cassette, {
        mode: shouldRecord ? "record" : "replay",
--- a/packages/opencode/test/session/llm-native.test.ts
+++ b/packages/opencode/test/session/llm-native.test.ts
@@ -1,8 +1,8 @@
 import { describe, expect, test } from "bun:test"
 import { ToolFailure } from "@opencode-ai/llm"
-import { LLMClient, RequestExecutor } from "@opencode-ai/llm/route"
+import { LLMClient, RequestExecutor, WebSocketExecutor } from "@opencode-ai/llm/route"
 import { jsonSchema, tool, type ModelMessage } from "ai"
-import { Effect } from "effect"
+import { Effect, Layer } from "effect"
 import { LLMNative } from "@/session/llm/native-request"
 import { LLMNativeRuntime } from "@/session/llm/native-runtime"
 import type { Provider } from "@/provider/provider"
@@ -138,16 +138,16 @@ describe("session.llm-native.request", () => {
    expect(request.model).toMatchObject({
      id: "gpt-5-mini",
      provider: "openai",
-      route: "openai-responses",
-      baseURL: "https://api.openai.com/v1",
-      headers: {
-        "x-model": "model-header",
-        "x-request": "request-header",
-      },
-      limits: {
-        context: 128_000,
-        output: 32_000,
-      },
+      route: { id: "openai-responses" },
+    })
+    expect(request.model.route.endpoint.baseURL).toBe("https://api.openai.com/v1")
+    expect(request.model.route.defaults.headers).toEqual({
+      "x-model": "model-header",
+      "x-request": "request-header",
+    })
+    expect(request.model.route.defaults.limits).toMatchObject({
+      context: 128_000,
+      output: 32_000,
    })
    expect(request.system).toEqual([
      { type: "text", text: "agent system" },
@@ -211,29 +211,50 @@ describe("session.llm-native.request", () => {
    ])
  })

-  test("selects native routes from existing provider packages", () => {
-    expect(
-      LLMNative.model({ ...baseModel, api: { ...baseModel.api, url: "", npm: "@ai-sdk/anthropic" } }),
-    ).toMatchObject({
-      route: "anthropic-messages",
-      baseURL: "https://api.anthropic.com/v1",
+  test("selects native request routes for provider packages", () => {
+    const openai = LLMNative.model({
+      model: { ...baseModel, api: { ...baseModel.api, url: "", npm: "@ai-sdk/openai" } },
+      apiKey: "test-key",
+      messages: [],
    })
-    expect(LLMNative.model({ ...baseModel, api: { ...baseModel.api, url: "", npm: "@ai-sdk/google" } })).toMatchObject({
-      route: "gemini",
-      baseURL: "https://generativelanguage.googleapis.com/v1beta",
+    expect(openai.route.id).toBe("openai-responses")
+    expect(openai.route.endpoint.baseURL).toBe("https://api.openai.com/v1")
+
+    const anthropic = LLMNative.model({
+      model: { ...baseModel, api: { ...baseModel.api, url: "", npm: "@ai-sdk/anthropic" } },
+      apiKey: "test-key",
+      messages: [],
    })
-    expect(
-      LLMNative.model({ ...baseModel, api: { ...baseModel.api, npm: "@ai-sdk/openai-compatible" } }),
-    ).toMatchObject({
-      route: "openai-compatible-chat",
-      baseURL: "https://api.openai.com/v1",
+    expect(anthropic.route.id).toBe("anthropic-messages")
+    expect(anthropic.route.endpoint.baseURL).toBe("https://api.anthropic.com/v1")
+
+    const google = LLMNative.model({
+      model: { ...baseModel, api: { ...baseModel.api, url: "", npm: "@ai-sdk/google" } },
+      apiKey: "test-key",
+      messages: [],
    })
-    expect(
-      LLMNative.model({ ...baseModel, api: { ...baseModel.api, url: "", npm: "@openrouter/ai-sdk-provider" } }),
-    ).toMatchObject({
-      route: "openrouter",
-      baseURL: "https://openrouter.ai/api/v1",
+    expect(google.route.id).toBe("gemini")
+    expect(google.route.endpoint.baseURL).toBe("https://generativelanguage.googleapis.com/v1beta")
+
+    const compatible = LLMNative.model({
+      model: {
+        ...baseModel,
+        providerID: ProviderID.make("opencode"),
+        api: { ...baseModel.api, url: "https://ai.example.test/v1", npm: "@ai-sdk/openai-compatible" },
+      },
+      apiKey: "test-key",
+      messages: [],
    })
+    expect(compatible.route.id).toBe("openai-compatible-chat")
+    expect(compatible.route.endpoint.baseURL).toBe("https://ai.example.test/v1")
+
+    const openrouter = LLMNative.model({
+      model: { ...baseModel, api: { ...baseModel.api, url: "", npm: "@openrouter/ai-sdk-provider" } },
+      apiKey: "test-key",
+      messages: [],
+    })
+    expect(openrouter.route.id).toBe("openrouter")
+    expect(openrouter.route.endpoint.baseURL).toBe("https://openrouter.ai/api/v1")
  })

  test("fails fast for unsupported provider packages", () => {
@@ -260,6 +281,20 @@ describe("session.llm-native.request", () => {
      type: "supported",
      apiKey: "test-openai-key",
    })
+    expect(
+      LLMNativeRuntime.status({
+        model: {
+          ...baseModel,
+          providerID: ProviderID.make("opencode"),
+          api: { ...baseModel.api, npm: "@ai-sdk/openai-compatible" },
+        },
+        provider: { ...providerInfo, id: ProviderID.make("opencode") },
+        auth: undefined,
+      }),
+    ).toMatchObject({
+      type: "supported",
+      apiKey: "test-openai-key",
+    })
    expect(
      LLMNativeRuntime.status({
        model: { ...baseModel, providerID: ProviderID.make("google") },
@@ -281,7 +316,7 @@ describe("session.llm-native.request", () => {
        provider: providerInfo,
        auth: undefined,
      }),
-    ).toEqual({ type: "unsupported", reason: "provider package is not OpenAI or Anthropic" })
+    ).toEqual({ type: "unsupported", reason: "provider package is not OpenAI, OpenAI-compatible, or Anthropic" })

    expect(
      LLMNativeRuntime.status({
@@ -382,12 +417,16 @@ describe("session.llm-native.request", () => {
      LLMClient.prepare(
        LLMNative.request({
          model: baseModel,
+          apiKey: "test-openai-key",
          messages: [{ role: "user", content: "hello" }],
          providerOptions: { openai: { store: false } },
          maxOutputTokens: 512,
          headers: { "x-request": "request-header" },
        }),
-      ).pipe(Effect.provide(LLMClient.layer), Effect.provide(RequestExecutor.defaultLayer)),
+      ).pipe(
+        Effect.provide(LLMClient.layer),
+        Effect.provide(Layer.mergeAll(RequestExecutor.defaultLayer, WebSocketExecutor.layer)),
+      ),
    )

    expect(prepared).toMatchObject({
--- a/packages/opencode/test/session/llm.test.ts
+++ b/packages/opencode/test/session/llm.test.ts
@@ -8,7 +8,7 @@ import { makeRuntime } from "../../src/effect/run-service"
 import { InstanceRef } from "../../src/effect/instance-ref"
 import { LLM } from "../../src/session/llm"
 import type { InstanceContext } from "../../src/project/instance-context"
-import { LLMClient, RequestExecutor } from "@opencode-ai/llm/route"
+import { LLMClient, RequestExecutor, WebSocketExecutor } from "@opencode-ai/llm/route"
 import { Auth } from "@/auth"
 import { Config } from "@/config/config"
 import { Provider } from "@/provider/provider"
@@ -82,7 +82,7 @@ function llmLayerWithExecutor(executor: Layer.Layer<RequestExecutor.Service>, fl
    Layer.provide(Config.defaultLayer),
    Layer.provide(Provider.defaultLayer),
    Layer.provide(Plugin.defaultLayer),
-    Layer.provide(LLMClient.layer.pipe(Layer.provide(executor))),
+    Layer.provide(LLMClient.layer.pipe(Layer.provide(Layer.mergeAll(executor, WebSocketExecutor.layer)))),
    Layer.provide(RuntimeFlags.layer(flags)),
  )
 }
@@ -1975,54 +1975,45 @@ describe("session.llm.stream", () => {
        const body = capture.body

        expect(capture.url.pathname.endsWith("/messages")).toBe(true)
-        expect(body.messages).toStrictEqual([
+        const messages = body.messages as Array<{ role: string; content: Array<Record<string, unknown>> }>
+        expect(messages[0]?.role).toBe("user")
+        expect(messages[0]?.content[0]).toMatchObject({
+          type: "text",
+          text: "Can you check whether there are any PDF files in my home directory?",
+        })
+        expect(messages.some((message) => message.content.some((part) => "cache_control" in part))).toBe(true)
+        const toolUseIndex = messages.findIndex((message) => message.content.some((part) => part.type === "tool_use"))
+        expect(toolUseIndex).toBeGreaterThan(0)
+        expect(messages[toolUseIndex].role).toBe("assistant")
+        expect(messages[toolUseIndex].content.filter((part) => part.type === "tool_use")).toMatchObject([
          {
-            role: "user",
-            content: [{ type: "text", text: "Can you check whether there are any PDF files in my home directory?" }],
+            type: "tool_use",
+            id: "toolu_01N8mDEzG8DSTs7UPHFtmgCT",
+            name: "read",
+            input: { filePath: "/root" },
          },
          {
-            role: "assistant",
-            content: [
-              {
-                type: "text",
-                text: "I checked your home directory and looked for PDF files.",
-              },
-              {
-                type: "tool_use",
-                id: "toolu_01N8mDEzG8DSTs7UPHFtmgCT",
-                name: "read",
-                input: { filePath: "/root" },
-              },
-              {
-                type: "tool_use",
-                id: "toolu_01APxrADs7VozN8uWzw9WwHr",
-                name: "glob",
-                input: { pattern: "**/*.pdf", path: "/root" },
-                cache_control: {
-                  type: "ephemeral",
-                },
-              },
-            ],
-          },
-          {
-            role: "user",
-            content: [
-              {
-                type: "tool_result",
-                tool_use_id: "toolu_01N8mDEzG8DSTs7UPHFtmgCT",
-                content: "<path>/root</path>",
-              },
-              {
-                type: "tool_result",
-                tool_use_id: "toolu_01APxrADs7VozN8uWzw9WwHr",
-                content: "No files found",
-                cache_control: {
-                  type: "ephemeral",
-                },
-              },
-            ],
+            type: "tool_use",
+            id: "toolu_01APxrADs7VozN8uWzw9WwHr",
+            name: "glob",
+            input: { pattern: "**/*.pdf", path: "/root" },
          },
        ])
+        expect(messages[toolUseIndex + 1]).toMatchObject({
+          role: "user",
+          content: [
+            {
+              type: "tool_result",
+              tool_use_id: "toolu_01N8mDEzG8DSTs7UPHFtmgCT",
+              content: "<path>/root</path>",
+            },
+            {
+              type: "tool_result",
+              tool_use_id: "toolu_01APxrADs7VozN8uWzw9WwHr",
+              content: "No files found",
+            },
+          ],
+        })
      },
    })
  })
--- a/packages/ui/src/components/message-part-text.ts
+++ b/packages/ui/src/components/message-part-text.ts
@@ -0,0 +1,3 @@
+export function readPartText(accum: Record<string, string> | undefined, part: { id: string; text?: string }): string {
+  return (accum?.[part.id] ?? part.text ?? "").trim()
+}
--- a/packages/ui/src/components/message-part.test.ts
+++ b/packages/ui/src/components/message-part.test.ts
@@ -0,0 +1,28 @@
+import { describe, expect, test } from "bun:test"
+import { readPartText } from "./message-part-text"
+
+describe("readPartText", () => {
+  test("returns empty string when accum is undefined and part text is undefined", () => {
+    expect(readPartText(undefined, { id: "part_1" })).toBe("")
+  })
+
+  test("returns trimmed part text when accum is undefined", () => {
+    expect(readPartText(undefined, { id: "part_1", text: "  hello  " })).toBe("hello")
+  })
+
+  test("prefers accum value over part text when accum has a hit", () => {
+    expect(readPartText({ part_1: "  from accum  " }, { id: "part_1", text: "from part" })).toBe("from accum")
+  })
+
+  test("falls back to part text when accum misses", () => {
+    expect(readPartText({ other_part: "ignored" }, { id: "part_1", text: "  from part  " })).toBe("from part")
+  })
+
+  test("returns empty string for whitespace-only text", () => {
+    expect(readPartText(undefined, { id: "part_1", text: "   \n\t  " })).toBe("")
+  })
+
+  test("trims leading and trailing whitespace", () => {
+    expect(readPartText(undefined, { id: "part_1", text: "\n  body  \n" })).toBe("body")
+  })
+})
--- a/packages/ui/src/components/message-part.tsx
+++ b/packages/ui/src/components/message-part.tsx
@@ -57,6 +57,7 @@ import { patchFiles } from "./apply-patch-file"
 import { animate } from "motion"
 import { useLocation } from "@solidjs/router"
 import { attached, inline, kind } from "./message-file"
+import { readPartText } from "./message-part-text"

 async function writeClipboard(text: string): Promise<boolean> {
  const body = typeof document === "undefined" ? undefined : document.body
@@ -1497,7 +1498,7 @@ PART_MAPPING["text"] = function TextPartDisplay(props) {
  const streaming = createMemo(
    () => props.message.role === "assistant" && typeof (props.message as AssistantMessage).time.completed !== "number",
  )
-  const text = () => (data.store.part_text_accum_delta?.[part().id] ?? part().text ?? "").trim()
+  const text = () => readPartText(data.store.part_text_accum_delta, part())
  const isLastTextPart = createMemo(() => {
    const last = (data.store.part?.[props.message.id] ?? [])
      .filter((item): item is TextPart => item?.type === "text" && !!item.text?.trim())
@@ -1563,7 +1564,7 @@ PART_MAPPING["reasoning"] = function ReasoningPartDisplay(props) {
  const streaming = createMemo(
    () => props.message.role === "assistant" && typeof (props.message as AssistantMessage).time.completed !== "number",
  )
-  const text = () => (data.store.part_text_accum_delta?.[part().id] ?? part().text ?? "").trim()
+  const text = () => readPartText(data.store.part_text_accum_delta, part())

  return (
    <Show when={text()}>