Last updated: 2026-05-21

What the classifier does

Every document you upload runs through a classifier built on Haiku 4.5, Anthropic's small, fast, low-cost model. Each document gets one call and one JSON response containing a document type, a confidence rating, and a set of metadata fields appropriate to that type.

Recognised types

  • inspection_report — pre/post-tenancy inspections, periodic check-ups.
  • ber_cert — SEAI Building Energy Rating certs. Triggers compliance auto-extract.
  • rtb_registration — Residential Tenancies Board registration confirmations. Triggers compliance auto-extract.
  • gas_safety_cert — annual landlord gas safety inspection. Triggers compliance auto-extract.
  • electrical_safety_cert — electrical periodic inspection / safety certificate. Triggers compliance auto-extract.
  • insurance — landlord / property insurance policies.
  • lease, letting_agreement — tenancy agreements. Tenants, term, rent, deposit, RTB number all extracted.
  • invoice — bills owed (vs. paid).
  • lpt_receipt — Revenue Local Property Tax confirmations.
  • receipt — paid receipts. Vendor, total, VAT, expense category all extracted.
  • statement — monthly agency statements (XLSX / PDF). Triggers a separate dedicated parser.
  • photo — non-text images.
  • other — anything that doesn't fit the above.

What the classifier returns

{
  "document_type": "lease",
  "confidence": "high",
  "extracted_metadata": {
    "tenant_names": ["Aoife Brennan", "Conor Lynch"],
    "term_start": "2026-03-01",
    "term_end": "2032-03-01",
    "monthly_rent": 1850,
    "deposit": 1850,
    "lease_type": "fixed"
  }
}

Extracted metadata is stored as JSON on the document row and surfaced in the property's documents tab. Anchorlet doesn't try to be clever — if a field isn't visible in the text, it's omitted (not guessed).
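
A minimal sketch of how a consumer might read this response, assuming it arrives as a JSON string shaped like the example above. `ClassifierResult` and `parse_classification` are hypothetical names used for illustration, not part of Anchorlet. The type and confidence values come from the lists on this page. The main point is that an omitted field should be treated as absent rather than given a default.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Values copied from the "Recognised types" and "Confidence levels" lists.
RECOGNISED_TYPES = {
    "inspection_report", "ber_cert", "rtb_registration", "gas_safety_cert",
    "electrical_safety_cert", "insurance", "lease", "letting_agreement",
    "invoice", "lpt_receipt", "receipt", "statement", "photo", "other",
}
CONFIDENCE_LEVELS = {"high", "medium", "low"}


@dataclass
class ClassifierResult:  # hypothetical container, for illustration only
    document_type: str
    confidence: str
    extracted_metadata: dict[str, Any] = field(default_factory=dict)


def parse_classification(raw: str) -> ClassifierResult:
    """Parse and sanity-check one classifier response (hypothetical helper)."""
    data = json.loads(raw)
    doc_type = data["document_type"]
    confidence = data["confidence"]
    if doc_type not in RECOGNISED_TYPES:
        raise ValueError(f"unrecognised document_type: {doc_type!r}")
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unrecognised confidence: {confidence!r}")
    return ClassifierResult(doc_type, confidence, data.get("extracted_metadata", {}))


raw = (
    '{"document_type": "lease", "confidence": "high", '
    '"extracted_metadata": {"monthly_rent": 1850}}'
)
result = parse_classification(raw)
# A field the classifier could not see is simply missing, so read it with .get().
deposit = result.extracted_metadata.get("deposit")  # absent in this response
```

Reading optional fields with `.get()` keeps the "omitted, not guessed" behaviour intact on the consuming side.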

Confidence levels

  • high — clear standard format, all required fields visible.
  • medium — likely, but with ambiguity (e.g. invoice that could be maintenance or compliance).
  • low — guessed from sparse text. Typically lands in Needs review.
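
As a rough illustration of the routing described above, the sketch below flags low-confidence documents for review. The function name `needs_review` is hypothetical, and the rule is an assumption: this page only says low confidence "typically" lands in Needs review, so the real behaviour may take other signals into account.

```python
CONFIDENCE_LEVELS = ("high", "medium", "low")


def needs_review(confidence: str) -> bool:
    """Return True when a document should be queued for a human check.

    Assumption: only "low" is routed to review.
    """
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unrecognised confidence: {confidence!r}")
    return confidence == "low"


# Hypothetical batch of (filename, confidence) pairs.
batch = [("lease.pdf", "high"), ("scan_0042.jpg", "low"), ("invoice.pdf", "medium")]
review_queue = [name for name, confidence in batch if needs_review(confidence)]
```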

When the classifier triggers downstream extractors

For statements and the four compliance certs (ber_cert, rtb_registration, gas_safety_cert, electrical_safety_cert), classification is just the gate: once the classifier tags a file, a heavier Opus 4.7 extractor runs to pull the structured fields (period dates, rent totals, expiry dates, etc.).
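
The gate can be pictured as a simple membership check on the classified type. This is a sketch only: `triggers_extractor` and the constant names are hypothetical, and the set of types is taken from the list of recognised types on this page.

```python
# The four compliance certs named in "Recognised types".
COMPLIANCE_CERT_TYPES = frozenset(
    {"ber_cert", "rtb_registration", "gas_safety_cert", "electrical_safety_cert"}
)
# Statements also go to a heavier downstream extractor.
EXTRACTOR_TYPES = COMPLIANCE_CERT_TYPES | {"statement"}


def triggers_extractor(document_type: str) -> bool:
    """Return True if this type is handed on to a downstream extractor."""
    return document_type in EXTRACTOR_TYPES
```

For example, `triggers_extractor("ber_cert")` is true, while a `lease` is handled by the classifier alone.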

Re-running on a corrected file

If the classifier got it wrong:

  1. Open the document's detail page.
  2. Edit the document and change the type manually. This overrides the classifier's guess.
  3. If the file itself was the problem (corrupted scan, bad export), upload a new version. The dedupe check is content-hashed, so a corrected scan that differs by even one byte counts as a fresh upload.
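
To see why a one-byte change is enough, here is a small sketch of content-hash deduplication. The choice of SHA-256 and the helper names `content_hash` and `is_duplicate` are assumptions for illustration; this page only states that the check is content-hashed.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash the raw file bytes (SHA-256 is an assumed algorithm)."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate(data: bytes, seen_hashes: set[str]) -> bool:
    """Return True if a file with identical bytes was already uploaded."""
    return content_hash(data) in seen_hashes


original = b"%PDF-1.7 scanned lease"
corrected = b"%PDF-1.7 scanned lease."  # differs by a single byte

seen = {content_hash(original)}
same_file_again = is_duplicate(original, seen)   # identical bytes: duplicate
corrected_upload = is_duplicate(corrected, seen)  # different bytes: fresh upload
```

Because the hash is computed over the file's bytes and not its name, re-uploading the same file under a new filename would still match, while any re-scan or re-export produces a new hash.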