Domain knowledge base & ontology

This is the authority the product is built on. Bank-statement analysis and cash-flow underwriting are an established discipline with precise definitions; we write them down here so that the system is correct according to the domain, not according to whatever happens to look plausible in code.

Why this exists

The product already makes domain decisions: what counts as income, how FOIR (fixed-obligation-to-income ratio) is computed, which flags mean “risk,” and what score a borrower gets. Originally those decisions lived as scattered constants and regexes (scoring.py FOIR bands, analytics.py category/obligation/channel tables, income heuristics). That is the failure mode this knowledge base prevents: rules that look reasonable but are not grounded in the domain, cannot be reviewed by a credit person, and drift over time.

The contract: docs define, code derives

- docs/domain/ is the authority. Every domain concept (an entity, a transaction category, a metric formula, a threshold, a flag severity) is defined here, together with its basis (a cited source, a stated convention, or an explicit assumption).
- server/src/domain/ is the single source of truth in code. Two modules now hold it:
  - policy.py holds the scoring rubric and the tunable analysis thresholds (income detection, and which obligation types count toward FOIR) as a typed, documented, per-lender-configurable ScoringPolicy. Both scoring.py and analytics.py read from it, so neither carries inline magic numbers. A lender overrides any field via tenant.scoring_policy, exposed in the app at Settings → Risk scoring policy (owner-only) or through the PATCH /v1/org/scoring-policy API.
  - taxonomy.py holds the transaction taxonomy lifted out of the pipeline: channel/category/obligation/income/risk patterns as one table set, each citing its ontology section. analytics.py imports it, so the regex is the ontology entry (the gap that ontology §7 flagged is closed).
- Every domain constant in code cites its definition here, and every definition here names the code symbol that implements it. They are reviewed together.
- Golden tests are domain-conformance tests: they pin the behaviour the ontology specifies, so a rubric change is a conscious, reviewed edit to both the doc and the test.
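
As a rough sketch of the contract above, the snippet below shows one way a typed policy with per-lender overrides could look. The field names (`foir_high_threshold`, `min_income_occurrences`, `foir_obligation_types`), their default values, and the helpers `apply_overrides` and `foir_is_high` are hypothetical and chosen only for illustration; the real `ScoringPolicy` in policy.py may be shaped differently.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Mapping


@dataclass(frozen=True)
class ScoringPolicy:
    """Illustrative stand-in for the real ScoringPolicy (all fields are assumptions)."""

    # Hypothetical FOIR ceiling, as a fraction of income, above which a case is flagged.
    foir_high_threshold: float = 0.50
    # Hypothetical minimum number of recurring credits before a stream counts as income.
    min_income_occurrences: int = 3
    # Hypothetical set of obligation types that count toward the FOIR numerator.
    foir_obligation_types: frozenset[str] = frozenset({"emi", "credit_card"})


def apply_overrides(base: ScoringPolicy, overrides: Mapping[str, Any]) -> ScoringPolicy:
    """Return a copy of `base` with lender overrides applied; unknown keys are rejected."""
    known = {f.name for f in fields(base)}
    unknown = set(overrides) - known
    if unknown:
        raise ValueError(f"unknown policy fields: {sorted(unknown)}")
    return replace(base, **overrides)


def foir_is_high(foir: float, policy: ScoringPolicy) -> bool:
    """Scoring code reads the threshold from the policy rather than an inline constant."""
    return foir > policy.foir_high_threshold


default_policy = ScoringPolicy()
# A lender that wants a stricter FOIR ceiling overrides a single field.
strict_policy = apply_overrides(default_policy, {"foir_high_threshold": 0.40})
```

Rejecting unknown keys in `apply_overrides` is a design choice in this sketch: a typo in a lender override fails loudly instead of silently leaving the default in place. A golden test would then pin, for example, that a FOIR of 0.45 is flagged under `strict_policy` but not under `default_policy`.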

Grounding status (used throughout)

- ✅ grounded: backed by a named authority (RBI, NPCI, DPDP Act, a documented lender convention).
- ⚠️ VALIDATE: our current assumption (often a threshold) that needs sign-off from a credit/underwriting expert or a target lender before we treat it as settled. Honesty over false precision.
- 🔲 TODO: not yet defined.
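
One way the three statuses could travel with the code is sketched below: each taxonomy rule carries its pattern, the ontology section it cites, and a grounding status, so the open items can be listed mechanically. The names `Grounding`, `TaxonomyRule`, `RULES`, `classify`, and `open_items`, the two example patterns, and the statuses assigned to them are all hypothetical; they are not taken from taxonomy.py.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Grounding(Enum):
    GROUNDED = "grounded"  # backed by a named authority
    VALIDATE = "validate"  # current assumption awaiting expert sign-off
    TODO = "todo"          # not yet defined


@dataclass(frozen=True)
class TaxonomyRule:
    name: str              # code symbol for the concept
    pattern: re.Pattern    # regex applied to the transaction narration
    ontology_ref: str      # where the concept is defined in the docs
    grounding: Grounding


# Hypothetical entries; the patterns and statuses are illustrative only.
RULES = (
    TaxonomyRule("channel.upi", re.compile(r"\bUPI\b", re.IGNORECASE),
                 "ontology.md (channel)", Grounding.GROUNDED),
    TaxonomyRule("obligation.emi", re.compile(r"\bEMI\b", re.IGNORECASE),
                 "ontology.md (obligation type)", Grounding.VALIDATE),
)


def classify(narration: str) -> list[str]:
    """Return the names of all rules whose pattern matches the narration."""
    return [rule.name for rule in RULES if rule.pattern.search(narration)]


def open_items() -> list[str]:
    """List rules that still need sign-off or a definition."""
    return [rule.name for rule in RULES if rule.grounding is not Grounding.GROUNDED]
```

Under these assumptions, `open_items()` produces the same kind of list that validation-checklist.md tracks by hand, which makes it harder for a doc entry and its code constant to disagree about what is settled.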

Map

| File | Covers |
| --- | --- |
| `ontology.md` | Entities (Borrower, Case, Statement/Account, Transaction, Counterparty) and the transaction taxonomy: direction, channel, income type, obligation type, risk-signal type. |
| `metrics.md` | The underwriting metrics and their formulas: income (types, regularity, stability), balances (ADB/AMB/min/negative-days), cash flow, obligations, FOIR / DBR. |
| `risk.md` | Risk signals (what each means and why it matters to a lender), severity rationale, and the deterministic scoring rubric with its thresholds (operationalized and configurable in `src/domain/policy.py`). |
| `decision.md` | Layers 3–4: eligibility (supportable EMI, loan sizing) and the decision policy (approve / conditions / counter-offer / refer / decline), per-product, auditable. |
| `references.md` | Sources: RBI circulars, NPCI rails, DPDP Act, Account Aggregator (Sahamati), and any lender conventions we anchor to. |
| `validation-checklist.md` | The open ⚠️ VALIDATE items (thresholds and severities) with their current defaults: the agenda for an SME / target-lender sign-off. |
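
To give a feel for the kind of formula metrics.md pins down, here is a minimal sketch of FOIR and average daily balance (ADB) under their common definitions: FOIR as monthly fixed obligations divided by monthly income, and ADB as the mean of end-of-day balances over the period. The function names, argument shapes, and the choice of which obligation types count are assumptions for this example; metrics.md and the lender's policy remain the authority.

```python
def foir(monthly_obligations: dict[str, float],
         monthly_income: float,
         loading_types: frozenset[str]) -> float:
    """Fixed obligations that count toward FOIR, divided by monthly income."""
    if monthly_income <= 0:
        raise ValueError("monthly income must be positive to compute FOIR")
    # Only obligation types the policy marks as loading enter the numerator.
    loaded = sum(amount for kind, amount in monthly_obligations.items()
                 if kind in loading_types)
    return loaded / monthly_income


def average_daily_balance(end_of_day_balances: list[float]) -> float:
    """Mean of the end-of-day balances over the statement period."""
    if not end_of_day_balances:
        raise ValueError("need at least one daily balance")
    return sum(end_of_day_balances) / len(end_of_day_balances)


# Illustrative figures only: rent is present but does not load FOIR here.
example_foir = foir(
    {"emi": 12000.0, "credit_card": 3000.0, "rent": 10000.0},
    monthly_income=50000.0,
    loading_types=frozenset({"emi", "credit_card"}),
)
example_adb = average_daily_balance([1000.0, 2000.0, 3000.0])
```

With these made-up figures, 15,000 of loading obligations against 50,000 of income gives a FOIR of 0.30. Whether an item such as rent should load is exactly the kind of question the docs are meant to settle explicitly rather than leave to a code default.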

How a domain expert reads this

Start with ontology.md (the vocabulary), then metrics.md (how the numbers are defined), then risk.md (how they become a verdict). Anything marked ⚠️ VALIDATE is an open question for you.