Last updated: 2026-05-21
What the classifier does
Every document you upload runs through a Haiku 4.5 classifier (the cheap fast Anthropic model). One call, one JSON response — the output is a document type, a confidence rating, and a bag of type-appropriate metadata.
Recognised types
- inspection_report — pre/post-tenancy inspections, periodic check-ups.
- ber_cert — SEAI Building Energy Rating certs. Triggers compliance auto-extract.
- rtb_registration — Residential Tenancies Board registration confirmations. Triggers compliance auto-extract.
- gas_safety_cert — annual landlord gas safety inspection. Triggers compliance auto-extract.
- electrical_safety_cert — electrical periodic inspection / safety certificate. Triggers compliance auto-extract.
- insurance — landlord / property insurance policies.
- lease, letting_agreement — tenancy agreements. Tenants, term, rent, deposit, RTB number all extracted.
- invoice — bills owed (vs. paid).
- lpt_receipt — Revenue Local Property Tax confirmations.
- receipt — paid receipts. Vendor, total, VAT, expense category all extracted.
- statement — monthly agency statements (XLSX / PDF). Triggers a separate dedicated parser.
- photo — non-text images.
- other — anything that doesn't fit the above.
What the classifier returns
{
"document_type": "lease",
"confidence": "high",
"extracted_metadata": {
"tenant_names": ["Aoife Brennan", "Conor Lynch"],
"term_start": "2026-03-01",
"term_end": "2032-03-01",
"monthly_rent": 1850,
"deposit": 1850,
"lease_type": "fixed"
}
}
Extracted metadata is stored as JSON on the document row and surfaced in the property's documents tab. Anchorlet doesn't try to be clever — if a field isn't visible in the text, it's omitted (not guessed).
Confidence levels
- high — clear standard format, all required fields visible.
- medium — likely, but with ambiguity (e.g. invoice that could be maintenance or compliance).
- low — guessed from sparse text. Typically lands in Needs review.
When the classifier triggers downstream extractors
For statements + the four compliance certs, classification is just the gate — once the classifier tags a file, a heavier Opus 4.7 extractor runs to pull the structured fields (period dates, rent totals, expiry dates, etc.). See:
Re-running on a corrected file
If the classifier got it wrong:
- Open the document's detail page.
- Edit to change the type manually. The classifier's guess is overridden.
- If the file itself was the problem (corrupted scan, bad export), upload a new version — the dedupe check is content-hashed, so a corrected scan with even one byte different is a fresh upload.