Run your first GL reconciliation

A walkthrough of the GL reconciliation workflow from first invocation to approved output — what happens at each layer and what the reviewer sees.

Published 2026-05-11

This article walks through a GL reconciliation run in Yig Thinker — from invocation to ready-to-ship output. It describes the layers, not the implementation details.

Before starting, you need beta access and a configured deployment. If you do not have access yet, join the waitlist at yig.com/waitlist.

What a GL reconciliation does

A GL reconciliation produces a list of differences for a reviewer, not a single corrected number. The reviewer decides.

A GL reconciliation compares two representations of the same period’s ledger data — typically a source GL system and a downstream system such as a data warehouse, reporting tool, or spreadsheet — and surfaces the differences.

The goal is not to produce a single “correct” number automatically. It is to identify every line where the two sources disagree, explain why each difference might exist, and present those findings to a reviewer.

Step 1 — Invoke the workflow

The first run is a calibration, not a verdict. Sarah is a controller at ACME US. It is day 4 of the Q2 2026 close. She is running her first GL reconciliation, comparing ACME’s warehouse-loaded trial balance (TB) line by line against the SAP general ledger of record.

From the CLI:

$ yig thinker run gl_recon --period Q2-2026 --entity "ACME US"

From Slack:

@yig-thinker run gl_recon for ACME US Q2

From either surface, the invocation reaches the agent at L2. The planner reads the gl_recon template — the declared sequence of data reads and comparisons — and begins execution.

Step 2 — The agent reads the sources

The reads are scoped to the period and entity specified. Nothing else is read. The planner calls the data connectors declared in the template — for a standard recon, two reads: the primary source (GL system) and the comparison source (warehouse or spreadsheet).

The audit log records each read: which connector, which scope, how many rows. If either read fails — network error, permission denied, source unavailable — the planner halts and records the failure. It does not proceed with partial data.
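The halt-on-failure behavior described above can be sketched as follows. This is a minimal illustration, not Yig Thinker's implementation: `AuditLog`, `ReadFailed`, `read_sources`, and the connector call signature are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict  # one ledger line, e.g. {"line_id": "GL-0412", "amount": 42100}


class ReadFailed(Exception):
    """Raised when a declared read fails; the run must not continue."""


@dataclass
class AuditLog:
    # Hypothetical in-memory stand-in for the audit log.
    entries: list = field(default_factory=list)

    def record(self, **fields) -> None:
        self.entries.append(fields)


def read_sources(
    connectors: dict[str, Callable[[str, str], list[Row]]],
    period: str,
    entity: str,
    audit: AuditLog,
) -> dict[str, list[Row]]:
    """Run every declared read for one period and entity.

    Each read is logged with its connector, scope, and row count.
    The first failure is logged and re-raised, so no comparison
    ever runs on partial data.
    """
    results: dict[str, list[Row]] = {}
    for name, read in connectors.items():
        try:
            rows = read(period, entity)
        except Exception as exc:
            audit.record(event="read_failed", connector=name,
                         period=period, entity=entity, error=str(exc))
            raise ReadFailed(f"{name}: {exc}") from exc
        audit.record(event="read", connector=name,
                     period=period, entity=entity, rows=len(rows))
        results[name] = rows
    return results
```

Raising instead of returning a partial result is the design choice that matters here: the caller cannot accidentally compare one complete source against one missing source.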

Step 3 — The planner compares the sources

The comparison runs against the declared tolerance, not a default the agent invented. For each GL line in the primary source, the planner looks for the corresponding line in the comparison source. It records:

  • Lines that match exactly.
  • Lines that are present in one source but not the other.
  • Lines that are present in both but with different amounts.

The comparison result is the raw material for the draft.
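The three outcomes listed above can be sketched as a keyed comparison. The article does not say how the declared tolerance interacts with exact matching, so this example assumes that a delta within the tolerance counts as matched and that a tolerance of zero means exact matching. The function name `compare` and the dictionary-per-source input shape are illustrative, not the product's API.

```python
from decimal import Decimal


def compare(
    primary: dict[str, Decimal],
    comparison: dict[str, Decimal],
    tolerance: Decimal,
) -> dict[str, list]:
    """Classify every line_id seen in either source.

    Assumption: abs(delta) <= tolerance counts as a match, and the
    tolerance is always passed in by the caller, never defaulted.
    """
    matched, one_sided, amount_differs = [], [], []
    for line_id in sorted(primary.keys() | comparison.keys()):
        in_primary = line_id in primary
        in_comparison = line_id in comparison
        if not (in_primary and in_comparison):
            # Present in one source but not the other.
            side = "primary" if in_primary else "comparison"
            one_sided.append((line_id, side))
            continue
        delta = primary[line_id] - comparison[line_id]
        if abs(delta) <= tolerance:
            matched.append(line_id)
        else:
            # Present in both, with different amounts.
            amount_differs.append((line_id, delta))
    return {
        "matched": matched,
        "one_sided": one_sided,
        "amount_differs": amount_differs,
    }
```

Making `tolerance` a required argument mirrors the point that the threshold comes from the workflow declaration rather than from a value the code supplies on its own.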

Step 4 — The agent produces the draft

The output shape is fixed in the template. The content changes; the shape does not. A standard GL reconciliation draft contains:

  • A summary: lines compared, matched, differences by type.
  • A matched-lines section, collapsed by default and expandable.
  • A differences table: each differing line with amounts, delta, candidate explanation.
  • Yellow flags: lines the agent cannot place.

Sarah’s first draft renders like this in the Slack thread:

gl_recon · ACME US · 2026-Q2 · v1
─────────────────────────────────────────────────────
1,247 lines compared · 1,241 matched · 6 differences · 2 flags

Differences (collapsed; click to expand)
─────────────────────────────────────────────────────
  line_id   primary   compare   delta     candidate
  ─────────────────────────────────────────────────
  GL-0412   42,100    0         42,100    timing (accrual unposted)
  GL-0718   128,400   128,414   -14       rounding (fx conversion)
  GL-0904   0         128,000  -128,000   ⚑ no counterpart found
  GL-1102   58,000    57,200    800       fx rate mismatch
  GL-1244   0         42,500   -42,500    ⚑ post-period entry?
  GL-1247   12,000    11,800    200       rounding (allocation)

Flags requiring judgment
─────────────────────────────────────────────────────
  ⚑ GL-0904 — large variance, no counterpart in primary
  ⚑ GL-1244 — entry visible in warehouse, missing in primary

The draft is presented to the reviewer at whatever surface the workflow was invoked from.
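One way to picture "the shape is fixed, the content changes" is a pair of immutable record types that every draft must fit. The sketch below is an assumption about how such a shape could be modeled; the class and field names (`Difference`, `Draft`, `flagged`, and so on) are hypothetical, and the rows are the six differences from Sarah's draft above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    line_id: str
    primary: int
    compare: int
    candidate: str         # the agent's candidate explanation
    flagged: bool = False  # True when the line needs a human judgment

    @property
    def delta(self) -> int:
        return self.primary - self.compare


@dataclass(frozen=True)
class Draft:
    workflow: str
    entity: str
    period: str
    version: int
    lines_compared: int
    differences: tuple

    @property
    def matched(self) -> int:
        return self.lines_compared - len(self.differences)

    @property
    def flags(self) -> list:
        return [d for d in self.differences if d.flagged]

    def summary(self) -> str:
        return (f"{self.lines_compared:,} lines compared · "
                f"{self.matched:,} matched · "
                f"{len(self.differences)} differences · "
                f"{len(self.flags)} flags")


draft = Draft("gl_recon", "ACME US", "2026-Q2", 1, 1247, (
    Difference("GL-0412", 42100, 0, "timing (accrual unposted)"),
    Difference("GL-0718", 128400, 128414, "rounding (fx conversion)"),
    Difference("GL-0904", 0, 128000, "no counterpart found", flagged=True),
    Difference("GL-1102", 58000, 57200, "fx rate mismatch"),
    Difference("GL-1244", 0, 42500, "post-period entry?", flagged=True),
    Difference("GL-1247", 12000, 11800, "rounding (allocation)"),
))
print(draft.summary())
# prints: 1,247 lines compared · 1,241 matched · 6 differences · 2 flags
```

Deriving the matched count, the deltas, and the flag count from the rows keeps the summary line from drifting out of step with the table beneath it.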

Step 5 — The reviewer acts

The reviewer’s attention belongs on the flagged lines first. The draft renders the same content on every surface; the chrome differs.

Step 5 on     | What Sarah sees | How she responds
CLI           | The draft above as formatted text; prompt asks [a]ccept · [r]eject · [c]omment <id> | Types c GL-0904 "misposted to DE-001; reclass next cycle" then a to accept
Slack         | Same draft as a structured message; action buttons under each flag | Replies in-thread to each flag; clicks Accept all candidates when done
Excel sidebar | Sidebar table with rows linked to TB cells; clicking GL-0904 scrolls the workbook to row 904 | Comments on each flag in the sidebar; staged edits appear as diffs before write-back

On a first run, the reviewer’s order of attention is consistent:

  1. The flagged lines first. These are the lines the agent could not place. Each one needs a human judgment before the run can complete.
  2. The unexplained differences next. These are lines whose candidate explanation does not match Sarah’s expectation of why the line would differ. She can reject the candidate or edit the explanation.
  3. The matched-line spot-check last. Not every cycle; on a first run, expand a random sample of the matched section and verify the agent matched the right rows.

For each yellow-flagged line, Sarah reads the agent’s explanation and makes the call: timing difference, real error, policy item? She records her judgment in a comment. For each difference line with a candidate, she accepts, rejects, or edits it.

When Sarah is satisfied, she approves. The approval is recorded in the audit log: reviewer name, timestamp, and a reference to the version of the draft that was approved.
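The approval step can be sketched as a small gate plus a record. The three recorded fields (reviewer, timestamp, draft version reference) come from the description above, and the rule that every flag needs a judgment first comes from the reviewer's order of attention. The names `Approval` and `approve`, and the format of the draft reference string, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    reviewer: str
    approved_at: datetime
    draft_ref: str  # hypothetical format, e.g. "gl_recon/ACME US/2026-Q2/v1"


def approve(reviewer: str, draft_ref: str,
            flagged_ids: list[str], comments: dict[str, str]) -> Approval:
    """Return an approval record, refusing while any flag lacks a comment."""
    unresolved = sorted(set(flagged_ids) - set(comments))
    if unresolved:
        raise ValueError(
            "flags still need a judgment: " + ", ".join(unresolved))
    # Timestamps are taken in UTC so records compare cleanly across surfaces.
    return Approval(reviewer, datetime.now(timezone.utc), draft_ref)
```

Tying the record to a specific draft version means a later v2 of the same reconciliation cannot be mistaken for the one Sarah signed off on.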

Step 6 — The output ships

The approved output is the data of record from this point on. It is written to the destination configured for this workflow — a spreadsheet, a reporting system, a close folder, or wherever this team’s reconciled GL data lives.

The audit trail at L3 records the full chain: invocation, data reads, draft, every reviewer action, and the final approved state. The trail is append-only: entries do not change after the fact.
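The article does not describe how the append-only property is enforced. One common technique, shown here purely as an illustration and not as Yig Thinker's mechanism, is a hash chain: each entry stores the hash of the previous one, so any later modification becomes detectable. The `AuditTrail` class and its methods are hypothetical.

```python
import hashlib
import json


class AuditTrail:
    """Append-only event log with a SHA-256 hash chain (illustrative)."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else ""
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        self._entries.append({"event": payload, "prev": prev, "hash": digest})
        return digest

    def entries(self) -> tuple:
        # Return decoded copies so callers cannot mutate stored entries.
        return tuple(json.loads(e["event"]) for e in self._entries)

    def verify(self) -> bool:
        """Recompute the chain; False means an entry changed after the fact."""
        prev = ""
        for e in self._entries:
            expected = hashlib.sha256(
                (prev + e["event"]).encode("utf-8")).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```

The class offers no update or delete method, and `verify` catches edits made behind its back, which is the property a later reviewer would rely on.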

What you should expect across cycles

The flag rate is not a quality metric. It is a calibration curve.

Cycle 1. On Sarah’s first run, the agent has no context for what normal looks like. It errs toward flagging — expect 8–12 flags on a thousand-line TB. Sarah’s comments on each become the deployment’s record of normal.

Cycle 3. By August, the agent has seen three cycles. Routine timing-difference flags have collapsed into the candidate-explanation category. Expect 3–5 flags — the ones that genuinely need judgment.

Cycle 10. A year in. The lines that still flag are material variances, post-period entries, and structural changes. The noise has been resolved into candidates. A reviewer on cycle 10 reviews exceptions; a reviewer on cycle 1 was teaching the deployment what exceptions are.

The agent does not learn this by training on data. It accumulates cycle-level context within the deployment — the reviewer’s resolved-flag history, the workflow definition’s evolution, the candidate-explanation patterns that have been accepted enough times to become defaults. The bet is that the cycle-10 reviewer trusts the agent because they remember teaching it on cycle 1 — and the audit trail makes that history defensible to a reviewer who arrives at cycle 14 having never met cycle 1.
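The idea that an explanation "accepted enough times" becomes a default can be sketched as a simple counter over reviewer decisions. The article does not state a threshold or a storage model, so `promote_after=3` and the `CandidateHistory` class are assumptions chosen only to make the mechanism concrete.

```python
from collections import Counter
from typing import Optional


class CandidateHistory:
    """Counts accepted explanations per line within one deployment."""

    def __init__(self, promote_after: int = 3) -> None:
        # Hypothetical threshold; the real value is not documented here.
        self.promote_after = promote_after
        self._accepted: Counter = Counter()

    def record_accept(self, line_id: str, explanation: str) -> None:
        self._accepted[(line_id, explanation)] += 1

    def default_for(self, line_id: str) -> Optional[str]:
        """Return the most-accepted explanation once it meets the threshold."""
        eligible = [
            (count, explanation)
            for (lid, explanation), count in self._accepted.items()
            if lid == line_id and count >= self.promote_after
        ]
        return max(eligible)[1] if eligible else None


history = CandidateHistory()
for _ in range(3):
    history.record_accept("GL-0718", "rounding (fx conversion)")
print(history.default_for("GL-0718"))  # prints: rounding (fx conversion)
print(history.default_for("GL-0904"))  # prints: None
```

Nothing here is model training: the state is a plain record of reviewer decisions, which is why it can be inspected and defended alongside the audit trail.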