Last updated: 2026-04-25

What auto-extraction does

When you upload a statement, Anchorlet doesn't store the file as-is. It parses the contents into structured rows so you can search, sort, and run reports across months and properties.

What gets extracted

For every statement:

Agency name (e.g. "Ray Cooke Auctioneers")
Period start + end (the month the statement covers)
Totals — rent received, management fee, expenses, net to landlord, carried balance

For every per-property row:

Property address as written by the agency (the "raw" string)
Invoice number + date
Rent received, management fee, expenses deducted, balance transferred, property balance
Match confidence — how sure Anchorlet is that the row maps to a specific property in your workspace

For every expense line:

Expense date, description, raw property address, net / VAT / total amounts, invoice id
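
To make the three levels concrete, here is a minimal sketch of how the extracted fields could be modelled. The class and field names are illustrative assumptions derived from the lists above, not Anchorlet's actual schema, and the choice of Decimal for money and a 0 to 1 float for match confidence is also assumed.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class ExpenseItem:
    # One expense line (hypothetical field names).
    expense_date: date
    description: str
    raw_property_address: str
    net: Decimal
    vat: Decimal
    total: Decimal
    invoice_id: Optional[str] = None


@dataclass
class StatementEntry:
    # One per-property row (hypothetical field names).
    raw_address: str  # address exactly as the agency wrote it
    invoice_number: Optional[str]
    invoice_date: Optional[date]
    rent_received: Decimal
    management_fee: Decimal
    expenses_deducted: Decimal
    balance_transferred: Decimal
    property_balance: Decimal
    match_confidence: float  # assumed scale: 0.0 (no match) to 1.0 (certain)


@dataclass
class Statement:
    # Statement-level fields plus the child rows (hypothetical field names).
    agency_name: str
    period_start: date
    period_end: date
    rent_received: Decimal
    management_fee: Decimal
    expenses: Decimal
    net_to_landlord: Decimal
    carried_balance: Decimal
    entries: list[StatementEntry] = field(default_factory=list)
    expense_items: list[ExpenseItem] = field(default_factory=list)
```

Keeping the raw address on each row, separate from the matched property, is what lets a low-confidence match be reviewed later without losing what the agency originally wrote.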

XLSX vs PDF

XLSX uses a deterministic parser. It is fast (under a second) and exact, but it only works for known formats.
PDF uses an LLM (Claude Opus 4.7). It is slower (10–30 seconds) but handles any layout, as long as the PDF has a text layer. The LLM is given a strict JSON schema to follow, so the output has the same shape as the XLSX path.

Both paths land in the same statements + statement_entries + statement_expense_items tables — the rest of the app doesn't need to know which way it came in.
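
The routing described above could look roughly like this. It is a sketch under stated assumptions: the function names, the extractor labels, and the set of required keys are all hypothetical, and the real JSON schema is not shown in this article.

```python
from pathlib import Path

# Hypothetical top-level keys that both extraction paths are expected to produce.
REQUIRED_KEYS = {
    "agency_name",
    "period_start",
    "period_end",
    "totals",
    "entries",
    "expense_items",
}


def choose_extractor(filename: str) -> str:
    """Pick an extraction path from the file extension (labels are illustrative)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".xlsx":
        return "xlsx_parser"  # deterministic, known formats only
    if suffix == ".pdf":
        return "pdf_llm"  # LLM constrained by a JSON schema
    raise ValueError(f"Unsupported statement format: {filename}")


def missing_keys(payload: dict) -> set:
    """Return the required top-level keys that the extracted payload lacks."""
    return REQUIRED_KEYS - set(payload)
```

Running the same shape check on both paths is one way to ensure that downstream code never has to branch on where a statement came from.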

What doesn't get extracted

Free-text notes the agency wrote in cells (these are kept as raw text but not indexed).
Diagrams, scans, or images embedded in PDFs — only the text layer is read.
Anything outside the standard rent/fee/expense grid. If your agency includes deposits, mortgage payments, or capital expenditures inline, those land in the raw_extract blob but aren't surfaced in the totals.

Reconciliation

After ingest, Anchorlet checks that the per-row totals add up to the statement-level totals. If they don't, you'll see a Reconciliation mismatch banner on the statement detail page. This is usually a sign that the agency's spreadsheet had a manual override or a row that doesn't fit the standard shape. The extracted data is still there; the banner only flags the discrepancy.
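
A reconciliation check of this kind can be sketched as follows. The field names, the dictionary-based row shape, and the one-cent tolerance are assumptions for illustration; the article does not say which fields Anchorlet compares or how much rounding drift it accepts.

```python
from decimal import Decimal

# Hypothetical fields that exist both per row and at statement level.
RECONCILED_FIELDS = ("rent_received", "management_fee", "expenses")


def reconciliation_mismatches(totals, rows, tolerance=Decimal("0.01")):
    """Return {field: row_sum - statement_total} for each field that disagrees.

    An empty dict means the per-row figures add up to the statement totals.
    """
    mismatches = {}
    for name in RECONCILED_FIELDS:
        row_sum = sum((row[name] for row in rows), Decimal("0"))
        diff = row_sum - totals[name]
        if abs(diff) > tolerance:
            mismatches[name] = diff
    return mismatches


totals = {
    "rent_received": Decimal("3000.00"),
    "management_fee": Decimal("300.00"),
    "expenses": Decimal("150.00"),
}
rows = [
    {"rent_received": Decimal("1200.00"), "management_fee": Decimal("120.00"), "expenses": Decimal("0.00")},
    {"rent_received": Decimal("1800.00"), "management_fee": Decimal("150.00"), "expenses": Decimal("150.00")},
]
result = reconciliation_mismatches(totals, rows)
```

In this example the rent and expense columns reconcile, while the management fee rows sum to 270.00 against a stated total of 300.00, so only that field would be reported.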

Cost + telemetry

Each PDF extraction is a single Opus 4.7 call. Token cost is logged to usage_logs and visible at Settings → Usage. XLSX parses are local and free.