Brings the v0.3.1 integration work into ee so enterprise testing can
run against a single base. Seven feature PRs from dev:
- #927 Ripple widget docs via kb-go
- #939 IngestAdapter alias + IngestACL
- #940 Fleet bundle runtime + Sales Fleet
- #947 Fleet installer journal emission
- #948 Fleet REST router (templates + install)
- #949 get_journal FastAPI dep + fleet router migration
- #950 Google Drive SourceAdapter (first zero-copy federation source)
Plus the soul-protocol base-dep promotion (#173 on soul-protocol side)
and the two test fixes for failures that surfaced once soul became resolvable in CI.
One conflict resolved: uv.lock regenerated — soul-protocol bumped from
0.3.0 to 0.3.1 (matches the new base-dep pin). All other files merged
cleanly including both halves of ee/ (existing ee-cloud work plus the
new ee/fleet + ee/journal_dep from the integration track).
Full suite green: 4099 passed, 7 skipped.
* feat(fleet): installable bundle runtime + Sales Fleet template
A FleetTemplate is a YAML manifest naming a soul template + pocket +
connector list + scope tags. install_fleet() orchestrates the install
using existing primitives — SoulFactory, ConnectorRegistry, pocket
service — without introducing new runtime concepts. Sales Fleet ships
as the first bundled example.
What landed:
- ee/fleet/models.py
- FleetConnector — name + config + optional flag.
- FleetTemplate — name, display_name, version, soul_template ref,
pocket name + widgets, connector list, scope list, open metadata.
- FleetInstallStep + FleetInstallReport — per-step status
(succeeded/skipped/failed) so partial installs are observable
without re-running the whole pipeline.
- ee/fleet/installer.py
- load_fleet(path_or_name) reads YAML/JSON or resolves a bundled
name (sales-fleet → src/pocketpaw/fleet_templates/sales-fleet.yaml).
- install_fleet(fleet, *, soul_factory, connector_registry,
pocket_creator) — pure orchestrator. Each external dep is
injectable so tests substitute fakes; production callers pass
the real services.
- Each step is wrapped: per-step exceptions are caught + logged
+ marked as failed in the report so install never crashes the
runtime.
- Optional connectors get "skipped" when missing; required get
"failed" so admins see what to fix.
- src/pocketpaw/fleet_templates/sales-fleet.yaml
- Arrow soul + Pipeline pocket + HubSpot + Gong connectors,
scoped org:sales:*. Connectors marked optional so the demo
install works without external API keys.
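A minimal sketch of the per-step reporting described above, not the code in
ee/fleet/installer.py. `InstallStep`, `InstallReport`, `run_step`, and
`install_sketch` are hypothetical stand-ins for FleetInstallStep,
FleetInstallReport, and install_fleet(); the callables stand in for the
injected soul factory, pocket creator, and connector registry.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Optional

logger = logging.getLogger("fleet_sketch")


@dataclass
class InstallStep:
    name: str
    status: str  # "succeeded", "skipped", or "failed"
    detail: str = ""


@dataclass
class InstallReport:
    steps: list = field(default_factory=list)

    def failed_steps(self) -> list:
        return [s for s in self.steps if s.status == "failed"]

    def succeeded(self) -> bool:
        return not self.failed_steps()


def run_step(report: InstallReport, name: str, fn: Callable[[], None]) -> bool:
    """Run one step; record a failure in the report instead of raising."""
    try:
        fn()
    except Exception as exc:
        logger.warning("fleet step %s failed: %s", name, exc)
        report.steps.append(InstallStep(name, "failed", str(exc)))
        return False
    report.steps.append(InstallStep(name, "succeeded"))
    return True


def install_sketch(
    connectors: list,
    available: set,
    create_soul: Callable[[], None],
    create_pocket: Optional[Callable[[], None]] = None,
) -> InstallReport:
    report = InstallReport()
    # Stop early when the soul cannot be created, so no orphan pocket is left.
    if not run_step(report, "soul", create_soul):
        return report
    if create_pocket is None:
        report.steps.append(InstallStep("pocket", "skipped", "no pocket creator"))
    else:
        run_step(report, "pocket", create_pocket)
    for connector in connectors:
        name = f"connector:{connector['name']}"
        if connector["name"] in available:
            report.steps.append(InstallStep(name, "succeeded"))
        elif connector.get("optional", False):
            report.steps.append(InstallStep(name, "skipped", "not installed"))
        else:
            report.steps.append(InstallStep(name, "failed", "not installed"))
    return report
```

Passing fakes for `create_soul` and `create_pocket` mirrors how the tests
substitute the injected dependencies.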
Tests: 15 new in tests/cloud/test_fleet_installer.py covering:
- YAML + JSON manifest loading + bundled-by-name resolution +
missing-file error
- install_fleet creates soul + pocket + registers connectors
with mocked deps
- Skips pocket cleanly when creator unavailable
- Optional missing connector → skipped, required missing → failed
- Per-step exception is captured in the report
- Returns early on soul creation failure (no orphan pocket)
- Sales Fleet bundled, has Arrow soul + sales scope
- Sales Fleet connectors all optional (demo-friendly)
- Report.succeeded() + failed_steps() helpers
First PR of Move 7 PR-B. PR-C ships the Install Fleet UI.
* style(fleet): ruff auto-fix
* fix(fleet): point PyYAML import error at pocketpaw[soul]
PyYAML is pulled in transitively via pocketpaw[soul] -> soul-protocol[engine].
The error message was pointing at the transitive package, which sent operators
chasing the wrong install command. Point it at the pocketpaw extra that
actually owns the dependency.
---------
Co-authored-by: Prakash-1 <prakash-1@Mac.lan>
Pre-existing work carried in this branch:
- AgentService.create now eagerly materializes an agent's soul on disk
(non-fatal; AgentPool still lazy-inits on failure)
- MessageService.send_message / edit_message / toggle_reaction now use
_require_can_post so view-only group members are blocked from posting,
editing their own messages, and reacting
- New agent-pool helpers (ensure_soul) and e2e API test coverage
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidates the two parallel RBAC frameworks into a single route-level
authorization system and adds the missing pieces needed for a managed-onboarding
pilot: an endpoint to list pending invites, sane visibility defaults for
pockets/groups so invited teammates actually see shared work, and audit
logging on every denial.
RBAC consolidation
- New ACTIONS table (src/pocketpaw/ee/guards/actions.py) is the single source
of truth for every guarded action → (minimum role, stable deny code).
Covers workspace, group, message, pocket, agent, session, KB, invite,
and billing — 32 rows total.
- New audit helpers (src/pocketpaw/ee/guards/audit.py): log_denial() +
log_privileged_action() backed by the existing append-only audit log.
- New FastAPI deps: require_action(), require_action_any_workspace(),
require_group_action(), require_membership(), require_agent_owner_or_admin(),
require_pocket_edit(), require_pocket_owner().
- Route-level enforcement across workspace/chat/pockets/agents/sessions/kb
routers; permission checks removed from service bodies where now redundant.
- Group role model extended to 3 tiers (owner > admin > member), with per-member
override via Group.member_roles. Admin tier enforced via resolve_group_role()
and the updated _require_group_admin helper.
- Legacy ee/cloud/shared/permissions.py deleted; legacy require_role shim
removed from ee/cloud/shared/deps.py.
- Matrix test (tests/cloud/test_rbac_matrix.py) iterates every ACTIONS entry
across every peer role, verifying both allow and deny paths and the exact
Forbidden.code. Meta-test enforces coverage.
- Redundant tests/cloud/test_permissions.py removed.
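For orientation, a sketch of the table-driven check, assuming a simple linear
role ranking. Only `invite.create` and `Forbidden.code` come from this
change; the role names in `ROLE_RANK`, the `message.post` row, the deny-code
strings, and `check_action` are illustrative and may not match
src/pocketpaw/ee/guards/actions.py.

```python
# Hypothetical role ranking: a higher number means more privilege.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2, "owner": 3}

# action -> (minimum role, stable deny code). Example rows only.
ACTIONS = {
    "invite.create": ("admin", "invite.admin_required"),
    "message.post": ("member", "message.member_required"),
}


class Forbidden(Exception):
    """Denial carrying a stable machine-readable code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def check_action(action: str, role: str) -> None:
    """Raise Forbidden with the row's deny code when the role ranks too low."""
    min_role, deny_code = ACTIONS[action]
    if ROLE_RANK[role] < ROLE_RANK[min_role]:
        raise Forbidden(deny_code)
```

A route-level dependency such as require_action() would presumably resolve
the caller's role and then perform a lookup of this shape; the matrix test
can iterate `ACTIONS` directly for the same reason.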
Invite list endpoint
- GET /workspaces/{workspace_id}/invites returns all pending invites,
admin-only via require_action("invite.create"). Wires the admin UX for
listing + copy-link + revoke.
Visibility defaults
- Pocket.visibility default flipped "private" → "workspace" so new pockets
are visible to all workspace members out of the box. Owners can tighten
per-pocket via the Share tab.
- Group.type default flipped "public" → "private" so new groups are
invite-only. Public channels remain explicit via type="channel" or "public".
- CreateGroupRequest and UpdateGroupRequest updated; GroupService.update_group
now supports type changes. list_groups() treats "channel" the same as
"public" for workspace-wide readability.
- Existing tests updated to match new defaults.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Builds on PR-A's data layer. This PR is the HTTP face of the Paw Print
widget: spec serving for the embedded client, owner-authed CRUD for
widget management, and the inbound event pipeline that turns customer
clicks into Fabric objects.
Endpoints (all mounted at /paw-print/):
Owner surface — require `X-Paw-Print-Token` header matching the
widget's current access_token:
- POST /widgets
- GET /widgets (filter by pocket_id / owner)
- GET /widgets/{id}
- PATCH /widgets/{id}/spec
- POST /widgets/{id}/rotate-token
- DELETE /widgets/{id}
- GET /widgets/{id}/events
Public surface — CORS-gated per widget, no token:
- GET /paw-print/spec/{widget_id}
- POST /paw-print/events/{widget_id}
Security ordering on the ingest hot path:
1. Widget exists (404 otherwise).
2. `Origin` header is in the widget's allowed_domains (403 otherwise).
Empty allowlist disables the check — fine for local dev, set it in
production. Host match ignores ports and paths.
3. Payload size <= MAX_PAYLOAD_BYTES (4KB). 413 otherwise.
4. Rate limits — both the overall-per-minute and per-customer-per-
minute caps. 429 otherwise.
5. Guardian (the existing input-sanitization layer) screens the JSON
payload. When ee/ lacks the guardian backend the ingest quietly
proceeds — telemetry must never become a dependency that breaks
customer interactions.
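The ordering above can be sketched as a chain of early returns. This is an
illustration only: `IngestRequest` and `screen_ingest` are hypothetical, 4KB
is assumed to mean 4096 bytes, and the 200 status on success and on Guardian
rejection is an assumption (the source only states that rejection returns
accepted=false).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

MAX_PAYLOAD_BYTES = 4096  # assumed value for the 4KB cap


@dataclass
class IngestRequest:
    widget_found: bool
    origin_allowed: bool
    payload_bytes: int
    within_rate_limit: bool


def screen_ingest(
    req: IngestRequest,
    guardian: Optional[Callable[[], bool]] = None,
) -> Tuple[int, bool]:
    """Return (http_status, accepted), applying the checks in order."""
    if not req.widget_found:
        return 404, False
    if not req.origin_allowed:
        return 403, False
    if req.payload_bytes > MAX_PAYLOAD_BYTES:
        return 413, False
    if not req.within_rate_limit:
        return 429, False
    # A missing Guardian backend (guardian is None) never blocks ingest.
    if guardian is not None and not guardian():
        return 200, False
    return 200, True
```

Cheap checks run first, so an unknown widget or a bad origin never reaches
the rate limiter or Guardian.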
Once those checks pass, the event is appended to the widget's log. When the widget
has an `event_mapping` for the event type, `{{ placeholder }}` fields
are interpolated against `{payload, customer_ref}` and a Fabric object
is created via the existing store. Whole-placeholder templates return
the raw value so non-string payloads survive; mixed strings fall back
to stringified substitution. Missing paths resolve to empty string in
mixed mode.
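A small sketch of the two interpolation modes, assuming dotted paths such as
`payload.amount` inside the braces. `interpolate` and `_lookup` are
illustrative names rather than the `_interpolate` helper itself, and
returning None for a missing path in whole-placeholder mode is an assumption
(the source only specifies the mixed-mode behaviour).

```python
import re
from typing import Any

_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def _lookup(path: str, context: dict) -> Any:
    """Walk a dotted path through nested dicts; None when any key is missing."""
    current: Any = context
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def interpolate(template: str, context: dict) -> Any:
    whole = _PLACEHOLDER.fullmatch(template.strip())
    if whole:
        # Whole-placeholder template: return the raw value, type preserved.
        return _lookup(whole.group(1), context)

    def substitute(match: "re.Match") -> str:
        value = _lookup(match.group(1), context)
        return "" if value is None else str(value)

    # Mixed string: stringify each value; missing paths become "".
    return _PLACEHOLDER.sub(substitute, template)
```

With `{"payload": {"amount": 42}, "customer_ref": "c1"}` as context, the
template `{{ payload.amount }}` yields the integer 42, while
`Order {{ payload.amount }}` yields the string `Order 42`.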
Spec serving mirrors the same origin check and echoes the matched
Origin back as `Access-Control-Allow-Origin` along with `Vary: Origin`.
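One way the origin check and header echo could look, using only the standard
library. `origin_allowed` and `cors_headers` are hypothetical names;
rejecting a missing Origin when the allowlist is non-empty is an assumption,
and `allowed_domains` is taken to be already lowercased by the model
validator.

```python
from typing import List, Optional
from urllib.parse import urlsplit


def origin_host(origin: str) -> str:
    """Lowercased host of an Origin value, ignoring scheme, port, and path."""
    return (urlsplit(origin).hostname or "").lower()


def origin_allowed(origin: Optional[str], allowed_domains: List[str]) -> bool:
    if not allowed_domains:
        return True  # empty allowlist disables the check
    if not origin:
        return False  # assumption: no Origin header fails a non-empty allowlist
    return origin_host(origin) in allowed_domains


def cors_headers(origin: str) -> dict:
    """Echo the matched origin and mark the response as origin-dependent."""
    return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
```

Echoing the specific origin together with `Vary: Origin` keeps shared caches
from serving one site's CORS response to another.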
Tests: 19 new in `tests/cloud/test_paw_print_ingest.py`:
- widget CRUD with token auth (create, get, rotate, delete,
list-events) — 401 without the header, 204 on delete, 404 after
- spec endpoint across allowed / disallowed / missing origin + empty
allowlist opens up
- event ingest happy path + origin rejection + oversized payload +
per-customer rate limit firing on the 4th call + Guardian rejection
returning accepted=false + Fabric mapping creating an object
- `_interpolate` across full-placeholder, mixed string, missing-path
Full suite: 3991 passed. Ruff clean.
PR-C (the vanilla-JS widget bundle) and PR-D (the Paw OS admin panel)
land next. Routing is still mounted via `ee.api` — PR-D wires it into
the live app.
Kicks off Move 3 — the Paw Print customer-facing widget. Palantir is
backend-only; their decision loop ends at the operator. Paw OS closes
the loop to the customer: events on a brand-embedded widget land in a
Pocket, Instinct nudges the owner, the approved action flows back to
the widget. This PR is the backend side of that loop.
Minimum scope for this PR:
- `ee/paw_print/models.py`
- `PawPrintBlock` — a tagged-union render primitive (`text` / `image`
/ `list` / `button` / `form` / `divider`). No raw HTML, no arbitrary
JS. Forward-compatible: unknown fields on a block type are ignored.
- `PawPrintListItem`, `PawPrintAction`, `PawPrintFormField` — shared
sub-types.
- `PawPrintSpec` — what the widget bundle fetches and renders.
Validators enforce `<= 64 blocks per spec` and `<= 50 items per list`
so a compromised creator can't blow up the client.
- `PawPrintWidget` — owner-facing model. Validators lowercase +
deduplicate `allowed_domains`, cap the list at 20, enforce positive
rate limits, and mint per-widget access tokens via `secrets`.
- `PawPrintEvent` — inbound signal. Type is required and stripped;
`payload_size()` helps PR-B enforce the 4KB payload cap.
- `PawPrintEventMapping` — how a widget event turns into a Fabric
object (`{{ placeholder }}` interpolation lives in PR-B).
- Public module-level constants re-export the caps so PR-B's ingest
layer reads the same numbers without reaching into privates.
- `ee/paw_print/store.py`
- Async SQLite tables for widgets and events, matching the shape of
`InstinctStore` so the wiring is familiar. Indexes on
`(widget_id, timestamp DESC)` and `(widget_id, customer_ref)` so
PR-B's rate limiter stays cheap under load.
- CRUD for widgets, plus `update_spec` and `rotate_token` — rotation
mints a fresh `pp_tok_…` so a leaked token can be revoked without
tearing down the widget.
- Append-only event log with `recent_events`, `count_events_since`,
and the composite `within_rate_limit(overall_per_min,
per_customer_per_min, customer_ref)` primitive. PR-B's event-ingest
endpoint is a thin wrapper around this.
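The composite rate-limit primitive can be pictured with an in-memory event
list. This is a sketch under assumptions: the real store is async SQLite, the
function signatures here are invented for illustration, and a cap is treated
as reached when the window already holds that many events.

```python
import time
from typing import List, Optional, Tuple

WINDOW_SECONDS = 60

# Each event is (unix_timestamp, customer_ref).
Event = Tuple[float, Optional[str]]


def count_events_since(
    events: List[Event], since: float, customer_ref: Optional[str] = None
) -> int:
    """Count events at or after `since`, optionally for a single customer."""
    return sum(
        1
        for ts, ref in events
        if ts >= since and (customer_ref is None or ref == customer_ref)
    )


def within_rate_limit(
    events: List[Event],
    overall_per_min: int,
    per_customer_per_min: int,
    customer_ref: Optional[str],
    now: Optional[float] = None,
) -> bool:
    """True when one more event fits under both the overall and per-customer caps."""
    now = time.time() if now is None else now
    since = now - WINDOW_SECONDS
    if count_events_since(events, since) >= overall_per_min:
        return False
    if customer_ref is not None:
        if count_events_since(events, since, customer_ref) >= per_customer_per_min:
            return False
    return True
```

Both counts read the same time window, which is why the
`(widget_id, timestamp DESC)` and `(widget_id, customer_ref)` indexes matter
for the SQLite version.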
Tests: 19 new in `tests/cloud/test_paw_print_backend.py` covering
block caps, domain normalization and cap, rate-limit validation,
token format, CRUD round-trip including update + rotation + delete
idempotency, event ordering, window counting, and the
overall-vs-per-customer rate-limit interaction. Full suite: 3991
passed; ruff clean.
No routing, no WS, no Fabric integration yet — those land in PR-B.
PR-A landed the data types. This PR wires them into the proposal path
and adds the read endpoint the UI will consume.
Changes:
- `InstinctStore.propose()` now accepts optional `reasoning_trace` and
`fabric_snapshots` kwargs. The trace is serialized into
`AuditEntry.context["reasoning_trace"]` when present; snapshots are
persisted via `record_fabric_snapshot()` with their `audit_id`
rewritten to the freshly created audit row so the caller doesn't
have to know the audit ID at propose time.
- `POST /instinct/actions` accepts the same two optional fields on the
request body so direct HTTP callers (and the agent tool) can attach
decision inputs at propose time.
- `GET /instinct/audit/{id}` is the new read endpoint. With the default
`hydrate=0` it returns the entry plus the decoded trace if stored.
With `hydrate=1` it also expands the trace's referenced Fabric IDs —
both the immutable snapshots captured at decision time and the live
state of the same objects — so a reviewer can see what the agent saw
and compare against what the object looks like now.
- `FabricQueryTool` emits `SystemEvent(event_type="fabric_query",
data={"object_id": ..., "object_type": ...})` for each returned
object. When a TraceCollector is active (agent-loop wiring — follow
up), the events land in the trace automatically. In every other
context the emission is a no-op: any failure is logged at debug
level and swallowed so tool calls never break because telemetry is
sick.
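To make the snapshot re-keying concrete, here is a sketch with a frozen
dataclass standing in for FabricObjectSnapshot. `Snapshot`,
`rekey_snapshots`, and `propose_sketch` are hypothetical, and the audit ID is
a random UUID here purely for illustration.

```python
import uuid
from dataclasses import dataclass, field, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Snapshot:
    object_id: str
    audit_id: str = ""  # placeholder until the audit row exists
    properties: dict = field(default_factory=dict)


def rekey_snapshots(snapshots: List[Snapshot], audit_id: str) -> List[Snapshot]:
    """Return copies of the snapshots pointing at the new audit row."""
    return [replace(s, audit_id=audit_id) for s in snapshots]


def propose_sketch(snapshots: List[Snapshot]) -> Tuple[str, List[Snapshot]]:
    # The audit row is created first; its ID is only known at this point.
    audit_id = uuid.uuid4().hex
    return audit_id, rekey_snapshots(snapshots, audit_id)
```

Callers can therefore build snapshots before the audit ID exists and let the
store fill it in.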
Response shape:
```
HydratedAuditEntry {
entry: AuditEntry,
reasoning_trace: ReasoningTrace | null,
fabric_snapshots: FabricObjectSnapshot[],
fabric_current: [{ object_id, type_name, properties }]
}
```
`fabric_current` is best-effort — if the ee/ module isn't installed or
an object was deleted, the entry is simply omitted.
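The best-effort behaviour of `fabric_current` might be sketched as follows.
`hydrate_current` and the `get_object` callable are hypothetical; passing
None stands in for the ee/ module being unavailable, and a lookup that
returns None or raises stands in for a deleted object.

```python
from typing import Callable, List, Optional


def hydrate_current(
    object_ids: List[str],
    get_object: Optional[Callable[[str], Optional[dict]]],
) -> List[dict]:
    """Return live state for each ID that still resolves; omit the rest."""
    if get_object is None:
        return []
    current = []
    for object_id in object_ids:
        try:
            obj = get_object(object_id)
        except Exception:
            obj = None  # treat lookup errors like a missing object
        if obj is None:
            continue
        current.append(
            {
                "object_id": object_id,
                "type_name": obj["type_name"],
                "properties": obj["properties"],
            }
        )
    return current
```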
Tests: 8 new in `tests/cloud/test_decision_traces_wiring.py` covering
`propose()` with/without trace, snapshot re-keying to the audit row,
the propose endpoint accepting traces, and the hydration endpoint
across hydrate=0, hydrate=1, unknown-audit 404, and trace-less audit.
Full suite: 3991 passed; ruff clean.
Paired with PR-A (#930) and PR-C (paw-enterprise Why? drawer, next).
The audit log already records that a decision happened — actor, event,
timestamp. It never recorded *why* that decision was made. This PR
lays the foundation so every proposal carries its reasoning context.
Three pieces:
- `ee/instinct/trace.py`
- `ToolCallRef` — a tool invocation captured during proposal reasoning:
tool name, a stable args fingerprint (hash), a 200-char result preview,
and a duration reading. The fingerprint lets repeated calls dedupe
without losing the first result.
- `ReasoningTrace` — the top-level structure stored inside
`AuditEntry.context["reasoning_trace"]`. Holds the referenced
fabric_query / soul_recall / kb_article IDs, the tool_calls list,
and the prompt version + backend + model metadata needed to
reproduce what the agent saw.
- `FabricObjectSnapshot` — an immutable snapshot of a Fabric object
at decision time, keyed by `(object_id, audit_id)`. The trace only
stores IDs; the snapshot is what lets a compliance reviewer
reproduce the inputs three months later even after the live object
has changed.
- `ee/instinct/trace_collector.py`
- `TraceCollector` is the async context manager PR-B will wrap around
the `instinct_propose` tool. It subscribes to `SystemEvent`s on
enter, aggregates anything tagged `fabric_query`, `soul_recall`,
`kb_inject`, `tool_start`, or `tool_end`/`tool_result` into the
trace, and unsubscribes on exit — even when the body raises.
- Tool calls are deduplicated on `(tool, args_hash)` within a single
trace so repeated lookups stay compact. Reference lists
(fabric_queries, soul_memories, kb_articles) are deduplicated on
exit while preserving first-seen order.
- Unknown event types and malformed payloads are silently ignored.
The collector never fails the surrounding proposal.
- `InstinctStore` adds `record_fabric_snapshot`,
`get_snapshots_for_audit`, and `get_snapshots_for_object` plus a new
`instinct_fabric_snapshots` SQLite table with indexes on both
`audit_id` and `object_id` so hydration and history queries stay
cheap.
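The two dedup rules can be illustrated without the event bus. This is a
sketch: the SHA-256 over canonical JSON is an assumed fingerprint scheme, and
`args_hash`, `dedupe_tool_calls`, and `dedupe_preserving_order` are
illustrative helpers rather than TraceCollector methods.

```python
import hashlib
import json
from typing import List

PREVIEW_CHARS = 200  # result preview length named in the text


def args_hash(args: dict) -> str:
    """Stable fingerprint: identical args hash the same regardless of key order."""
    canonical = json.dumps(args, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def dedupe_tool_calls(calls: List[dict]) -> List[dict]:
    """Keep the first call per (tool, args_hash), with a truncated preview."""
    seen = set()
    kept = []
    for call in calls:
        key = (call["tool"], args_hash(call["args"]))
        if key in seen:
            continue
        seen.add(key)
        kept.append(
            {
                "tool": call["tool"],
                "args_hash": key[1],
                "result_preview": str(call.get("result", ""))[:PREVIEW_CHARS],
            }
        )
    return kept


def dedupe_preserving_order(ids: List[str]) -> List[str]:
    """Drop repeats while keeping first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(ids))
```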
Tests: 19 new in `tests/cloud/test_decision_traces.py` covering the
model round-trip, subscribe/unsubscribe on enter and exit (including
exception path), every event-type branch, tool-call dedup and
truncation, malformed-event tolerance, and the snapshot read paths.
Full suite: 3991 passed; ruff clean.
Paired with PR-B (wire collector into `instinct_propose` + hydration
endpoint) and PR-C (Why? drawer in paw-enterprise), landing next.
Part two of the correction loop. PR-A captured the diff between proposal
and approval; this PR feeds that signal into soul-protocol and makes it
available to the agent on the next draft.
Three pieces:
- `ee/instinct/correction_soul_bridge.py`
- `CorrectionSoulBridge.record(correction, action)` observes the edit as
a soul Interaction with the full patch list in the agent_output so
it enters the episodic tier with a recall-friendly body.
- Counts how many times the same `path` has been edited in this pocket.
On the third match, synthesizes a short deterministic rule (no LLM
call on the hot path) and stores it as a procedural memory at
importance 7.
- Degrades silently when the soul is absent, or when any step of the
soul handoff raises — approval must never fail because the soul is
sick. Corrections still persist to SQLite either way.
- `/approve` endpoint forwards to the bridge after `record_correction`
succeeds. Best-effort, never blocks the response.
- `src/pocketpaw/tools/builtin/instinct_corrections.py`
- New `instinct_corrections` agent tool. The agent calls it with a
`pocket_id` before proposing a new action and gets back a formatted
list of recent edits with their patch paths. Pre-applying those
patterns is the whole point — the draft already reflects the user's
history before it ever reaches the Tray.
- Registered alongside `instinct_propose` / `instinct_pending` /
`instinct_audit` in the enterprise tool block so all 7 agent
backends pick it up.
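A sketch of the third-match promotion heuristic, with an in-memory counter
standing in for the SQLite-backed count. `PromotionTracker`, the rule
wording, and the returned dict shape are hypothetical; only the threshold of
three, the procedural tier, and importance 7 come from the description above.

```python
from collections import Counter
from typing import Optional

PROMOTE_AT = 3       # the third edit of the same path triggers promotion
RULE_IMPORTANCE = 7  # importance assigned to the procedural memory


class PromotionTracker:
    """Counts edits per (pocket, path) and emits one rule at the threshold."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, pocket_id: str, path: str) -> Optional[dict]:
        key = (pocket_id, path)
        self._counts[key] += 1
        # Equality rather than >= means the rule fires exactly once per path.
        if self._counts[key] != PROMOTE_AT:
            return None
        return {
            "tier": "procedural",
            "importance": RULE_IMPORTANCE,
            "content": f"In this pocket, '{path}' is usually edited before approval.",
        }
```

Because the rule text is a fixed template, no LLM call is needed on the
approval path.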
Tests: 11 new in `tests/cloud/test_correction_soul_bridge.py` covering
the observe payload shape, the 3x-same-path promotion heuristic (fires
exactly once, tracks paths independently), graceful no-soul degradation,
and the tool's formatted output including the empty state. The optional
`soul_protocol` dependency is stubbed via autouse fixture so tests run
in the base dev env. Full suite: 3991 passed, 0 failed.
Paired with PR-A (#928) and PR-C (paw-enterprise UI, next).
The Instinct pipeline already captured what the agent proposed and what
the human ultimately decided. It did not capture the delta in between —
the edits a rep made to a draft before hitting approve. That delta is
the single cheapest learning signal we have, and it was being discarded
every time.
This change adds a first-class Correction record for each edit-then-approve.
What landed:
- `ee/instinct/correction.py`
- `CorrectionPatch` and `Correction` Pydantic models.
- `compute_patches(before, after)` does a structural diff across the
fields a human would actually edit — title, description,
recommendation, category, priority, plus the top-level keys of
`parameters`. `context` is intentionally skipped; it's reasoning
metadata, not action content.
- `summarize_correction(action, patches)` formats a deterministic,
LLM-free recall key for the soul bridge to key off in PR-B.
- `ee/instinct/store.py`
- New `instinct_corrections` SQLite table with indexes on `pocket_id`
and `action_id`.
- `record_correction()` persists the row and logs a `correction_captured`
audit event carrying the patch paths — so the Why? drawer in Move 2
can hydrate the trace without a second table.
- `get_corrections_for_pocket` / `get_corrections_for_action` /
`count_corrections_by_path` for the read paths the soul bridge and
the UI will consume.
- `ee/instinct/router.py`
- `/approve` now accepts an optional `ApproveRequest` body with edited
fields. When edits differ from the stored proposal, the server diffs
the two, persists a `Correction`, and writes the edits back to the
action before transitioning its status to approved.
- Response is wrapped in `ApproveResponse { action, correction | null }`
so callers always know whether learning was captured. The existing
no-body POST still approves unchanged.
- `GET /instinct/corrections?pocket_id=...|action_id=...` exposes the
captured deltas for the UI and the agent tool that lands in PR-B.
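For reference, a dict-based sketch of the structural diff. It is not the
`compute_patches` in ee/instinct/correction.py: the real function works on
Pydantic models, and the `parameters.<key>` path format and the patch dict
shape are assumptions.

```python
from typing import List

# Fields a human would edit, as listed above; `context` is deliberately absent.
EDITABLE_FIELDS = ("title", "description", "recommendation", "category", "priority")


def compute_patches_sketch(before: dict, after: dict) -> List[dict]:
    """Diff the editable top-level fields plus the top-level keys of `parameters`."""
    patches = []
    for name in EDITABLE_FIELDS:
        if before.get(name) != after.get(name):
            patches.append(
                {"path": name, "before": before.get(name), "after": after.get(name)}
            )
    old_params = before.get("parameters") or {}
    new_params = after.get("parameters") or {}
    for key in sorted(set(old_params) | set(new_params)):
        if old_params.get(key) != new_params.get(key):
            patches.append(
                {
                    "path": f"parameters.{key}",
                    "before": old_params.get(key),
                    "after": new_params.get(key),
                }
            )
    return patches
```

An empty result corresponds to the equal-values case, where the action is
approved and no Correction is recorded.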
Tests: 21 new in `tests/cloud/test_ee_correction.py` covering
compute_patches, summarize_correction, store CRUD, audit side-effect,
the three approve-body cases (unchanged, edited, equal-values), and
the corrections endpoint. Existing instinct suite adjusted for the new
approve response shape. Full pocketPaw suite: 3991 passed, 0 failed.
- Mock GoalParser in deep_work_session tests (was hitting real LLM, 4m10s → 1s)
- Add embedding_provider="hash" to file_memory_fixes tests (was timing out on ollama)
- Fix mock_config fixture in test_remote_access to use tmp_path (race condition under -n auto)
- Update assertions for WEBSOCKET channel now having format hints
- Accept 401 for /api/v1/sessions when ee.cloud router overrides core router
- Exclude tests/cloud and tests/e2e from default pytest run (pymongo hangs)
- Reorganize ee/cloud tests into tests/cloud/, v1 tests into tests/v1/
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the old soul_path string with structured soul integration fields
(soul_enabled, soul_persona, soul_archetype, soul_values, soul_ocean)
on AgentConfig. Flatten CreateAgentRequest so config fields are top-level
instead of a raw dict, and update the service layer to build AgentConfig
from those flat fields with auto-derived defaults. UpdateAgentRequest
now supports individual config/soul field overrides alongside the
existing bulk config dict path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verifies all 6 domain routers (auth, workspace, agents, chat, pockets,
sessions), the license endpoint, WebSocket endpoint, and CloudError
handler mount correctly on a FastAPI app without requiring MongoDB.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces Socket.IO with native FastAPI WebSocket management. Handles
connection lifecycle, user-to-connections mapping (multi-tab/device),
message routing to group members, typing indicators with auto-expiry,
and presence tracking with grace period support.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
First of the chat domain tasks — Pydantic schemas for REST
(group CRUD, messages, reactions) and WebSocket (inbound/outbound)
with cursor-based pagination support. 31 tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Convert ee/cloud/auth.py monolith into ee/cloud/auth/ package with proper
separation of concerns: core (fastapi-users setup), schemas (Pydantic
request/response models), service (business logic), and router (endpoints).
All existing imports preserved via __init__.py re-exports.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>