What we don't cache and why

Agent state is a trust decision before it is a performance decision. Yig rebuilds what should be auditable and persists only what survives review.

Published 2026-05-16

Every agent has a memory problem, and it has two failure modes.

A stateless agent cannot reason about a quarter. It walks into Q2 with no recollection of the Q1 audit findings, no muscle memory for which intercompany pair always reconciles late, no track record of the variance bands a particular line carries through a year. It is a new hire every Tuesday morning, which is to say it is a liability the controller has to chaperone.

A fully stateful agent inherits its own past mistakes silently. The accrual it estimated wrong in March is now part of its working memory in April. The miscategorised entry from a prior close is the seed for next close’s draft. It learns confidently from material it should have flagged the first time. The reviewer is asked to trust a system whose internal state nobody at the customer can see, audit, or roll back.

Both are wrong answers to the same question. The question is: what does the agent remember, and on whose authority?

This article is for the controller who has been asked to authorise an AI agent inside their close, for the security reviewer who has to vouch for the data handling, and for the CFO reading over both their shoulders. The argument is simple: every cache decision Yig makes is a trust decision dressed as a performance decision. We have designed for trust first.

The two failure modes of agent state

The state problem is not a tooling problem. It is a posture problem.

A stateless run is honest about its lack of context but useless for judgement-intensive work. A controller cannot ask “is this variance unusual” of a system that has never seen the line before. The agent has to know that consulting revenue runs at a 47% gross margin in this entity, that the December reclass moved $1.2M from one cost centre to another, that the FX rate moved 6% in the last week of the period. Without that context, every draft is a first draft.

A fully stateful run remembers all of that, plus everything it ever guessed wrong about it. There is no audit trail for what is inside the agent’s head. The internal audit lead cannot pull a report on the agent’s working memory. The security reviewer cannot answer the question “what did this system know about us last month, and where is that stored?” If the agent’s beliefs about the entity are wrong, there is no clean rollback. You cannot un-train working memory.

Lifestyle products treat this as an engineering problem and solve it with bigger caches, better embeddings, smarter context windows. Instruments treat it as a governance problem and ask a different question first: which pieces of state belong inside the agent at all, and which belong somewhere the operator can see and review?

State that the operator cannot audit is state the operator should not be asked to trust.

That is the load-bearing claim. Everything else follows from it.

The rebuild contract

The default behaviour of a Yig run is to throw away its working state at the end.

When the controller finishes their close and signs the pack, the agent’s intermediate reasoning, the schedules it built mid-run, the partial drafts it iterated on, the inferences it made about which lines mattered — these do not survive into the next run. The next run starts from the same source data, the same workflow definition, the same yellow flags resolved or unresolved, and rebuilds what it needs.

This sounds wasteful. It is.

It costs compute. It costs latency. A run that could have started from “where we left off” instead starts from “what is true in the customer’s stack right now.” If the close ran for six hours, the next run does not start at hour six. It starts at hour zero and re-derives everything along the way.
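The rebuild contract can be sketched in a few lines. This is an illustration under stated assumptions, not Yig’s implementation: `WorkingState`, `run_scope`, and `run_close` are hypothetical names, the source is modelled as a callable returning `(line, amount_cents)` pairs, and the workflow as a dictionary with a `lines` key.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class WorkingState:
    """In-flight scratch space: schedules and partial drafts (hypothetical shape)."""
    schedules: dict = field(default_factory=dict)
    partial_drafts: list = field(default_factory=list)


@contextmanager
def run_scope():
    """Create fresh working state for one run and empty it when the run ends."""
    state = WorkingState()
    try:
        yield state
    finally:
        # Nothing is handed to the next run: the scratch space is cleared here.
        state.schedules.clear()
        state.partial_drafts.clear()


def run_close(read_source, workflow):
    """Rebuild the schedules from the current source data and return a draft."""
    with run_scope() as state:
        # Hour zero: read what is true in the source right now.
        for line, amount_cents in read_source():
            if line in workflow["lines"]:
                state.schedules[line] = state.schedules.get(line, 0) + amount_cents
        # The draft is a copy handed to the reviewer, not the scratch space itself.
        draft = dict(state.schedules)
    return draft
```

Calling `run_close` twice performs the full derivation twice. If the source changed between the two calls, the second draft reflects the change because there is no earlier state for it to resume from.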

The performance cost is real, and we pay it on purpose.

Here is the architectural reason. The customer’s data of record lives in their stack. The general ledger is theirs. The workbook is theirs. The document store is theirs. Everything Yig produces is a function of that data plus the workflow plus the reviewer’s decisions, all of which are visible to the customer. If Yig also persisted derived state — schedules, intermediate calculations, partial drafts — then the customer would be trusting two sources of truth: the one they can see in their own stack, and the one buried inside the agent.

Two sources of truth are one too many. The audit trail says what the data of record was, what the agent did with it, and what the reviewer approved. It does not need another actor with private memory.
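“A function of that data plus the workflow plus the reviewer’s decisions” can be read literally: a pure function with no hidden argument. The sketch below is illustrative only. `derive_draft`, the row shape, the `excluded_lines` decision, and the SHA-256 fingerprint are all assumptions made for the example, not a description of Yig’s internals.

```python
import hashlib
import json


def derive_draft(rows, workflow, decisions):
    """Pure function of visible inputs: no cache, no memory of earlier runs."""
    totals = {}
    for row in rows:
        if row["line"] in workflow["lines"]:
            totals[row["line"]] = totals.get(row["line"], 0) + row["amount_cents"]
    # Reviewer decisions are an explicit input, e.g. lines the reviewer excluded.
    for line in decisions.get("excluded_lines", []):
        totals.pop(line, None)
    return totals


def fingerprint(rows, workflow, decisions):
    """Stable digest of every input that influenced the draft."""
    payload = json.dumps(
        {"rows": rows, "workflow": workflow, "decisions": decisions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because every input is an argument, someone reconstructing a run needs only the three inputs as they stood at the time. Equal fingerprints mean the inputs were the same, and the same inputs yield the same draft.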

Throwing the state away is the load-bearing decision. The compute cost is the price.

This is also why the question “can you make it faster by caching more aggressively” has a structural answer, not a performance answer. We can make it faster. We choose not to, on the runs where caching would put state outside the reviewer’s line of sight. The decision is not “compute is cheap.” The decision is “trust is expensive, and we are not going to spend it for a 30% latency improvement.”

The Bloomberg lineage understands this trade. Bloomberg terminals are not optimised for round-trip latency on a stale quote. They are optimised for being correct about what was true at the moment the operator looked. Yig is optimised for being correct about what was true in the customer’s stack at the moment the agent ran.

What persists, and the test each piece had to pass

The rebuild rule is the default. It has exceptions, and each exception had to earn its place.

There are three categories of state that survive a run. Each one was asked the same question: if this is wrong, can the operator detect that it is wrong, and can the auditor reconstruct what it was at the time of the decision?

Persisted artefact	Why it is kept	What changes if it is wrong
The append-only audit log	The audit log is the record of what the agent did. It exists precisely so that no other state needs to be persisted to reconstruct a run.	If the log is wrong, the reviewer can no longer prove what was drafted, who reviewed, and what was approved. The system loses its defensibility. This is the failure mode the entire architecture is designed against.
Workflow definitions and the resolved decisions inside them	A workflow is a versioned contract between the operator and the agent. The controller authored or accepted these rules; the agent’s job is to follow them.	If the workflow is wrong, the operator can roll back to a prior version. The version is named, dated, and approved. There is no hidden change.
Reviewer-resolved yellow flags within an ongoing close cycle	When a controller answers a yellow flag — “this variance is one-time, do not flag it again this period” — the decision sticks for the period. Otherwise the agent re-asks the same question every run.	If the resolution is wrong, the controller corrects it in the next review. The resolution is logged with the controller’s name and timestamp; it is not the agent’s belief, it is the controller’s instruction.

Notice what is not on this list.

There is no row for “the agent’s learned model of this entity.” There is no row for “embeddings of the customer’s prior closes.” There is no row for “summaries the agent built of the controller’s past decisions.” Those would be useful. They would also be state nobody outside the agent can see, audit, or roll back. They fail the test.

Each item in the table satisfies three conditions: it was authored or approved by a named operator, it lives somewhere the operator can read it, and a wrong version is visibly wrong. State that does not meet all three conditions does not persist. We do not make exceptions on the grounds that the state would be useful. Useful is not the threshold. Auditable is.
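The three conditions can be written as a single predicate. This is a minimal sketch with hypothetical names (`StateCandidate`, `may_persist`). It assumes each condition can be recorded as a simple field, which in practice is a judgement a person makes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateCandidate:
    name: str
    approved_by: Optional[str]  # named operator who authored or approved it
    operator_readable: bool     # lives somewhere the operator can read it
    wrong_is_visible: bool      # a wrong version is visibly wrong


def may_persist(candidate: StateCandidate) -> bool:
    """All three conditions must hold; usefulness is deliberately not an input."""
    return (
        bool(candidate.approved_by)
        and candidate.operator_readable
        and candidate.wrong_is_visible
    )


workflow = StateCandidate("workflow definition", "controller@example.com", True, True)
embeddings = StateCandidate("embeddings of prior closes", None, False, False)
print(may_persist(workflow), may_persist(embeddings))
```

The predicate has no parameter for how much time the state would save, so a useful but unreviewable candidate is rejected like any other.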

If a piece of state cannot be reviewed, it is not allowed to influence the next run.
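A reviewer-resolved yellow flag is the one persisted item that changes what the agent asks on the next run, so its scope matters. The sketch below assumes a resolution is keyed by flag and period and carries the reviewer’s name and timestamp. `Resolution`, `PeriodResolutions`, and `should_ask_again` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    flag_id: str
    period: str        # e.g. "2026-04"; the resolution applies to this period only
    resolved_by: str   # the controller's name, not the agent's
    resolved_at: str   # timestamp as recorded at review time
    instruction: str


class PeriodResolutions:
    """Reviewer instructions, looked up by (flag, period)."""

    def __init__(self):
        self._by_key = {}

    def record(self, resolution: Resolution) -> None:
        # A later review replaces the instruction for the same flag and period.
        self._by_key[(resolution.flag_id, resolution.period)] = resolution

    def lookup(self, flag_id: str, period: str):
        return self._by_key.get((flag_id, period))


def should_ask_again(store: PeriodResolutions, flag_id: str, period: str) -> bool:
    """Re-ask unless a named reviewer already answered for this period."""
    return store.lookup(flag_id, period) is None
```

A resolution recorded for one period stops the question from being repeated in that period and has no effect on the next one, where the agent asks again.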

The one place we don’t trust ourselves

There is one category of state where the rebuild contract is not enough. It is also the one category where we do not allow Yig to be the system of record under any circumstances.

The data that actually describes the business — the journals, the trial balance, the working papers, the intercompany schedules, the close package itself — lives in the customer’s stack. Not in Yig. Not in a Yig-adjacent store. Not in a Yig-managed environment with a separate retention policy. In the customer’s general ledger, the customer’s workbook, the customer’s document store. Where it already lived before Yig arrived. Where it will still live if Yig disappears.

This is the boundary that does not move.

There is a version of this product where it would be tempting to relax this. A Yig-side warehouse of recent closes would let the agent answer “is this variance unusual” faster. A Yig-side document store of past board packs would let the agent draft a new pack with style continuity. A Yig-side cache of reconciled intercompany pairs would let the agent skip work on the next close.

We do not build any of those. The reason is not that they are technically difficult — they are not. The reason is that the controller cannot vouch for state that lives in a vendor’s system. The security reviewer cannot vouch for it either. The first time something goes wrong in a Yig-side store, the customer has to discover the failure inside the vendor’s environment, on the vendor’s timeline, with the vendor’s tooling. That is the wrong shape of an audit response.

The right shape is: the data is yours, the agent reads it, the agent writes drafts back to it, the agent does not keep a copy. When the agent’s view of the world disagrees with the data of record, the data of record wins, every time, by construction.
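The “data of record wins” rule can be illustrated with a read-only view and an explicit comparison. This is a sketch under assumptions: the ledger is modelled as a dictionary of account to balance in cents, and `read_scoped_slice` and `resolve_against_record` are hypothetical helpers, not Yig functions.

```python
from types import MappingProxyType


def read_scoped_slice(ledger, accounts):
    """Return a read-only, in-flight view of only the accounts the run needs."""
    return MappingProxyType({a: ledger[a] for a in accounts if a in ledger})


def resolve_against_record(agent_view, data_of_record):
    """Where the agent's view disagrees, the data of record wins.

    Returns the authoritative values and the disagreements to surface
    to the reviewer as (account, agent_value, recorded_value) tuples.
    """
    disagreements = [
        (account, agent_view[account], recorded)
        for account, recorded in data_of_record.items()
        if account in agent_view and agent_view[account] != recorded
    ]
    return dict(data_of_record), disagreements


ledger = {"4000": 120000, "5000": -45000, "6000": -9000}
view = read_scoped_slice(ledger, ["4000", "5000"])
values, diffs = resolve_against_record({"4000": 119000, "5000": -45000}, view)
```

Here `values` holds the ledger’s figures and `diffs` lists the one account where the agent’s number differed. Assigning to `view["4000"]` raises `TypeError`, which is the property the read-only step in the architecture depends on.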

The data of record is not ours to keep.

This is the place where we do not trust ourselves to be careful enough. Not because the engineering would be hard. Because the trust posture matters more than the engineering convenience. A vendor that holds a copy of the customer’s books has crossed a line that is hard to un-cross. We do not start across that line.

Here is the practical shape of the architecture, drawn at the level a security reviewer can defend:

                      ┌──────────────────────────────┐
                      │  Workflow run begins         │
                      └──────────────┬───────────────┘
                                     │
                                     ▼
        ┌─────────────────────────────────────────────────────┐
        │  Read scoped slice of customer data (read-only)     │
        │  GL · workbook · document store · IC schedules       │
        └─────────────────────────────────────────────────────┘
                                     │
                                     ▼
        ┌─────────────────────────────────────────────────────┐
        │  Agent working state (in-flight only)               │
        │  Schedules · partial drafts · intermediate reasoning │
        │  Discarded at end of run · rebuilt next run          │
        └─────────────────────────────────────────────────────┘
                                     │
                                     ▼
        ┌─────────────────────────────────────────────────────┐
        │  Draft presented to reviewer at a Yig surface       │
        │  Slack DM · Excel sidebar (cells A1:F127) · CLI      │
        │  · Word footnote anchor                              │
        └─────────────────────────────────────────────────────┘
                                     │
                                     ▼
        ┌─────────────────────────────────────────────────────┐
        │  Reviewer action recorded in append-only audit log  │
        │  Persisted · operator-readable · no business content │
        │  retained on the vendor side                         │
        └─────────────────────────────────────────────────────┘
                                     │
                                     ▼
        ┌─────────────────────────────────────────────────────┐
        │  Approved draft written back to customer's stack    │
        │  Customer-owned · authoritative · the data of record │
        └─────────────────────────────────────────────────────┘

The dashed line that nobody draws but that matters: nothing crosses from the agent’s working state into a vendor-side store of business content. The audit log persists. The workflow definition persists. The customer’s data of record persists in the customer’s stack. The agent’s working memory does not.
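An append-only log that retains no business content could take the following form. The sketch rests on loudly stated assumptions: each entry stores a pointer into the customer’s stack plus a SHA-256 digest of what the reviewer saw, and entries are hash-chained so that an altered earlier entry is detectable. The `AuditEntry` and `AuditLog` names, the `gl://` reference format, and the hash chain are illustrative choices, not Yig’s documented design.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


@dataclass(frozen=True)
class AuditEntry:
    actor: str
    action: str
    target_ref: str      # pointer into the customer's stack, not the content
    content_digest: str  # SHA-256 of the draft the reviewer acted on
    at: str              # timestamp supplied by the caller
    prev_hash: str       # hash of the previous entry in the chain

    def entry_hash(self) -> str:
        body = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self):
        self._entries = []

    def append(self, actor, action, target_ref, content: bytes, at: str) -> AuditEntry:
        prev = self._entries[-1].entry_hash() if self._entries else GENESIS
        digest = hashlib.sha256(content).hexdigest()
        entry = AuditEntry(actor, action, target_ref, digest, at, prev)
        self._entries.append(entry)
        return entry

    def entries(self):
        return tuple(self._entries)

    def verify(self) -> bool:
        """Check that every entry points at the hash of the one before it."""
        prev = GENESIS
        for entry in self._entries:
            if entry.prev_hash != prev:
                return False
            prev = entry.entry_hash()
        return True
```

With this shape, the log can show that a named reviewer approved a specific draft at a specific time without the vendor holding the draft itself. The chain only detects changes to entries that have a successor, so a production design would also need to anchor the latest hash somewhere the operator controls.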

Cost vs trust

There is a cost calculation here, and this article should be honest about it.

Caching more would be faster. It would feel snappier. It would lower the latency on the second run of a similar workflow. There are vendors who have made the other choice — keep a rich working memory on the vendor side, train on it, condition the next run on it — and they will likely run faster on certain workflows than Yig does. We will not match that latency on those workflows. We are not trying to.

The trade we have made: every cache decision is a vote about who the agent works for.

A cache that lives on the vendor side, that the customer cannot inspect, that the controller cannot point to in the audit log, makes the agent slightly more loyal to the vendor than to the customer. The vendor is the only party that knows what is in it. The vendor is the only party that can change what is in it. The vendor is the only party that benefits from it being durable. Each marginal cache of that shape moves the centre of gravity of the agent toward the vendor and away from the operator.

A rebuild contract — throw the state away, read from the customer’s stack every time, persist only what is auditable — makes the agent loyal to the operator. The state the agent uses to reason is the state the operator can see. The state the agent does not have is the state the operator did not approve.

This is the same choice that separates instruments from lifestyle products. A lifestyle product optimises for the moment. It wants the second run to feel impressive. An instrument optimises for the decade. It wants the audit trail to still hold up when a different auditor with a different framework asks a different question three years from now.

The FP&A lead does not care about the milliseconds we save by caching their last variance commentary. They care about whether the variance commentary on next quarter’s pack is something they can defend to the audit committee. Those two things are not the same product, and the cache decisions tell you which product is being built.

There is a question this leaves open, and we are willing to be wrong on it. The question is whether the rebuild contract scales as the workflows get heavier. A close that involves twelve entities, four currencies, and three weeks of partial drafts is more expensive to rebuild from scratch than it is to resume from a cache. Some day, on some workflow, the cost of doing this correctly may exceed the cost a customer is willing to pay. On that day, the right answer is to ask the customer, in plain language, whether they want to keep paying the trust premium, and to announce any change in /docs/security.

Until that day, we throw the state away.

If we are wrong about the trust premium being worth the cost, we deserve to lose to the vendors who chose differently. If we are right, every close cycle the customer runs is a close cycle they can fully reconstruct, audit, and defend — three years from now, with a different reviewer in the chair, against a question we have not yet been asked.

The cache we did not build is the audit response we will not have to write.