mirror of
https://github.com/pocketpaw/pocketpaw.git
synced 2026-05-19 16:31:15 +00:00
The audit log already records that a decision happened (actor, event,
timestamp), but it has never recorded *why* that decision was made. This PR
lays the foundation so every proposal carries its reasoning context.
Three pieces:
- `ee/instinct/trace.py`
  - `ToolCallRef` is a tool invocation captured during proposal reasoning:
    tool name, a stable args fingerprint, a 200-char result preview, and
    a duration reading. The hash lets repeated calls dedupe without
    losing the first result.
  - `ReasoningTrace` is the top-level structure stored inside
    `AuditEntry.context["reasoning_trace"]`. It holds the referenced
    fabric_query / soul_recall / kb_article IDs, the tool_calls list,
    and the prompt version + backend + model metadata needed to
    reproduce what the agent saw.
  - `FabricObjectSnapshot` is an immutable snapshot of a Fabric object
    at decision time, keyed by `(object_id, audit_id)`. The trace only
    stores IDs; the snapshot is what lets a compliance reviewer
    reproduce the inputs three months later, even after the live object
    has changed.
- `ee/instinct/trace_collector.py`
  - `TraceCollector` is the async context manager PR-B will wrap around
    the `instinct_propose` tool. It subscribes to `SystemEvent`s on
    enter, aggregates anything tagged `fabric_query`, `soul_recall`,
    `kb_inject`, `tool_start`, or `tool_end`/`tool_result` into the
    trace, and unsubscribes on exit, even when the body raises.
  - Tool calls are deduplicated on `(tool, args_hash)` within a single
    trace so repeated lookups stay compact. Reference lists
    (fabric_queries, soul_memories, kb_articles) are deduplicated on
    exit while preserving first-seen order.
  - Unknown event types and malformed payloads are silently ignored.
    The collector never fails the surrounding proposal.
- `InstinctStore` adds `record_fabric_snapshot`,
  `get_snapshots_for_audit`, and `get_snapshots_for_object` plus a new
  `instinct_fabric_snapshots` SQLite table with indexes on both
  `audit_id` and `object_id` so hydration and history queries stay
  cheap.
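
As a rough illustration of the snapshot storage described in the last item, the sketch below creates an `instinct_fabric_snapshots` table with indexes on `audit_id` and `object_id` using the standard-library `sqlite3` module. The column names mirror the `FabricObjectSnapshot` fields, but the exact schema, the index names, and the free-standing helper functions are assumptions for illustration, not the actual `InstinctStore` implementation.

```python
import json
import sqlite3

# Hypothetical schema: column types and index names are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS instinct_fabric_snapshots (
    id TEXT PRIMARY KEY,
    object_id TEXT NOT NULL,
    audit_id TEXT NOT NULL,
    object_type TEXT NOT NULL DEFAULT '',
    snapshot TEXT NOT NULL DEFAULT '{}',
    created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_snapshots_audit
    ON instinct_fabric_snapshots (audit_id);
CREATE INDEX IF NOT EXISTS idx_snapshots_object
    ON instinct_fabric_snapshots (object_id);
"""


def open_store(path=":memory:"):
    """Open a connection and make sure the table and both indexes exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_fabric_snapshot(conn, snap_id, object_id, audit_id,
                           object_type, snapshot, created_at):
    """Insert one snapshot row; the snapshot dict is stored as JSON text."""
    conn.execute(
        "INSERT INTO instinct_fabric_snapshots VALUES (?, ?, ?, ?, ?, ?)",
        (snap_id, object_id, audit_id, object_type,
         json.dumps(snapshot), created_at),
    )
    conn.commit()


def get_snapshots_for_audit(conn, audit_id):
    """Hydration path: every object snapshot attached to one audit entry."""
    rows = conn.execute(
        "SELECT object_id, snapshot FROM instinct_fabric_snapshots "
        "WHERE audit_id = ? ORDER BY created_at",
        (audit_id,),
    ).fetchall()
    return [(object_id, json.loads(raw)) for object_id, raw in rows]


def get_snapshots_for_object(conn, object_id):
    """History path: every decision-time snapshot of one object."""
    rows = conn.execute(
        "SELECT audit_id, snapshot FROM instinct_fabric_snapshots "
        "WHERE object_id = ? ORDER BY created_at",
        (object_id,),
    ).fetchall()
    return [(audit_id, json.loads(raw)) for audit_id, raw in rows]
```

Both read paths filter on a single indexed column, which is the property the description relies on to keep hydration and history queries cheap.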
Tests: 19 new in `tests/cloud/test_decision_traces.py` covering the
model round-trip, subscribe/unsubscribe on enter and exit (including
the exception path), every event-type branch, tool-call dedup and
truncation, malformed-event tolerance, and the snapshot read paths.
Full suite: 3991 passed; ruff clean.
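
The dedup and truncation behavior those tests exercise can be sketched roughly as follows. The fingerprint scheme (SHA-256 over key-sorted JSON, shortened to 16 hex characters) and the function names are assumptions; the description only says the args hash is stable, the preview is 200 characters, and the first result wins.

```python
import hashlib
import json

PREVIEW_CHARS = 200  # preview length stated in the description


def args_fingerprint(args):
    """Hash arguments independently of key order (assumed scheme)."""
    canonical = json.dumps(args, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def dedupe_tool_calls(calls):
    """Keep the first call per (tool, args_hash) and truncate its preview.

    `calls` is an iterable of (tool_name, args_dict, result_string) tuples.
    """
    seen = {}
    for tool, args, result in calls:
        key = (tool, args_fingerprint(args))
        if key not in seen:  # later duplicates never overwrite the first result
            seen[key] = {
                "tool": tool,
                "args_hash": key[1],
                "result_preview": result[:PREVIEW_CHARS],
            }
    return list(seen.values())


def dedupe_preserving_order(ids):
    """Drop repeated IDs while keeping first-seen order."""
    return list(dict.fromkeys(ids))
```

`dict.fromkeys` works here because Python dictionaries preserve insertion order, so the first occurrence of each ID fixes its position.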
Paired with PR-B (wire collector into `instinct_propose` + hydration
endpoint) and PR-C (Why? drawer in paw-enterprise), landing next.
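
The subscribe-on-enter, unsubscribe-on-exit contract described for `TraceCollector` might look roughly like the sketch below. `EventBus`, `SketchCollector`, and the event payload shape are hypothetical stand-ins invented for this example; the real `SystemEvent` bus and collector API are not shown in this commit message.

```python
import asyncio


class EventBus:
    """Hypothetical in-memory bus standing in for the real SystemEvent bus."""

    def __init__(self):
        self.handlers = []

    def subscribe(self, handler):
        self.handlers.append(handler)

    def unsubscribe(self, handler):
        self.handlers.remove(handler)

    def publish(self, event):
        for handler in list(self.handlers):
            handler(event)


class SketchCollector:
    """Illustrative collector only; not the real TraceCollector."""

    def __init__(self, bus):
        self.bus = bus
        self.fabric_queries = []

    def _on_event(self, event):
        # Unknown types and malformed payloads are ignored, never raised.
        try:
            if event.get("type") == "fabric_query":
                self.fabric_queries.append(str(event["data"]["id"]))
        except Exception:
            return

    async def __aenter__(self):
        self.bus.subscribe(self._on_event)
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Runs on both normal exit and exceptions, so the handler never leaks.
        self.bus.unsubscribe(self._on_event)
        self.fabric_queries = list(dict.fromkeys(self.fabric_queries))
        return False  # do not swallow the body's exception


async def demo():
    bus = EventBus()
    collector = SketchCollector(bus)
    try:
        async with collector:
            bus.publish({"type": "fabric_query", "data": {"id": "fq_1"}})
            bus.publish({"type": "fabric_query", "data": {"id": "fq_1"}})
            bus.publish({"type": "unknown"})
            bus.publish(None)  # malformed event
            raise RuntimeError("proposal failed")
    except RuntimeError:
        pass
    return collector.fabric_queries, len(bus.handlers)
```

Running `asyncio.run(demo())` in this sketch leaves one deduplicated query ID and zero registered handlers, even though the body raised.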
66 lines
2.3 KiB
Python
# ee/instinct/trace.py - Decision trace types for the Instinct pipeline.
# Created: 2026-04-13 (Move 2 PR-A). Captures the reasoning inputs behind each
# proposed action so the audit log explains *why*, not just *what*. Paired with
# TraceCollector (the bus-subscriber context manager) and FabricObjectSnapshot
# (immutable rows that preserve referenced objects at decision time).

from __future__ import annotations

from datetime import datetime
from typing import Any

from pydantic import BaseModel, Field

from ee.fabric.models import _gen_id


class ToolCallRef(BaseModel):
    """One tool call captured during proposal reasoning.

    Stored inside `ReasoningTrace.tool_calls`. `args_hash` is a stable fingerprint
    of the invocation so repeated calls dedupe cleanly; `result_preview` is the
    first 200 chars of the result string so a human can inspect the trace without
    re-running the tool.
    """

    tool: str
    args_hash: str
    result_preview: str = ""
    duration_ms: int = 0


class ReasoningTrace(BaseModel):
    """Full reasoning context that produced a proposed action.

    Every decision that lands in `AuditEntry.context` under the key
    `reasoning_trace` follows this schema. Reference fields hold IDs only;
    hydrated content is resolved at read time via the `?hydrate=1` endpoint
    (Move 2 PR-B).
    """

    fabric_queries: list[str] = Field(default_factory=list)
    soul_memories: list[str] = Field(default_factory=list)
    kb_articles: list[str] = Field(default_factory=list)
    tool_calls: list[ToolCallRef] = Field(default_factory=list)
    prompt_version: str = ""
    backend: str = ""
    model: str = ""
    token_counts: dict[str, int] = Field(default_factory=dict)


class FabricObjectSnapshot(BaseModel):
    """An immutable snapshot of a Fabric object at the time a decision was made.

    When the live object later changes (ownership transfer, status update,
    anything), the trace still reproduces what the agent saw. Keyed by
    (object_id, audit_id) so a single query can be referenced by many
    decisions without duplication.
    """

    id: str = Field(default_factory=lambda: _gen_id("fos"))
    object_id: str
    audit_id: str
    object_type: str = ""
    snapshot: dict[str, Any] = Field(default_factory=dict)
    created_at: datetime = Field(default_factory=datetime.now)