Domain knowledge base & ontology
The authority the product is built on. Bank-statement analysis and cash-flow underwriting are a real discipline with real definitions; this is where we write them down so the system is correct by the domain, not by what happens to look plausible in code.
Why this exists
The product already makes domain decisions — what counts as income, how FOIR is computed, which
flags mean “risk,” what score a borrower gets. Today those decisions live as scattered constants
and regexes (scoring.py FOIR bands, analytics.py category/obligation/channel tables, income
heuristics). That is the failure mode this knowledge base prevents: rules that look reasonable
but aren’t grounded in the domain, can’t be reviewed by a credit person, and drift.
The contract: docs define, code derives
- docs/domain/ is the authority. Every domain concept (an entity, a transaction category, a metric formula, a threshold, a flag severity) is defined here, with its basis (a cited source, a stated convention, or an explicit assumption).
- server/src/domain/ is the single source of truth in code. Two modules now hold it:
  - policy.py: the scoring rubric and the tunable analysis thresholds (income detection, and which obligation types load FOIR) as a typed, documented, per-lender-configurable ScoringPolicy. scoring.py and analytics.py both read from it (no inline magic numbers); a lender overrides any field via tenant.scoring_policy, exposed in the app at Settings → Risk scoring policy (owner-only), or through the PATCH /v1/org/scoring-policy API.
  - taxonomy.py: the transaction taxonomy lifted out of the pipeline, with channel/category/obligation/income/risk patterns as one table set, each citing its ontology section. analytics.py imports it, so the regex is the ontology entry (the gap that ontology §7 flagged is closed).
- Every domain constant in code cites its definition here, and every definition here names the code symbol that implements it. They are reviewed together.
- Golden tests are domain-conformance tests — they pin the behaviour the ontology specifies, so a rubric change is a conscious, reviewed edit to both the doc and the test.
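
As a rough illustration of the contract above, the following sketch shows how a typed policy with per-lender overrides and a golden-style test might fit together. Everything in it is an assumption for illustration: the field names, the band labels, the numeric defaults, and the helpers `resolve_policy` and `foir_band` are hypothetical and are not taken from src/domain/policy.py.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ScoringPolicy:
    """Illustrative subset only; the real fields live in src/domain/policy.py."""

    # Placeholder FOIR band edges (fraction of income). NOT the product's thresholds.
    foir_comfortable_max: float = 0.40
    foir_stretched_max: float = 0.55


def resolve_policy(overrides: Optional[Mapping[str, Any]] = None) -> ScoringPolicy:
    """Layer a lender's overrides over the defaults; reject unknown field names."""
    base = ScoringPolicy()
    if not overrides:
        return base
    known = {f.name for f in fields(ScoringPolicy)}
    unknown = sorted(set(overrides) - known)
    if unknown:
        raise ValueError(f"unknown policy fields: {unknown}")
    return replace(base, **overrides)


def foir_band(foir: float, policy: ScoringPolicy) -> str:
    """Classify a FOIR value using only thresholds read from the policy."""
    if foir <= policy.foir_comfortable_max:
        return "comfortable"
    if foir <= policy.foir_stretched_max:
        return "stretched"
    return "overextended"


def test_foir_band_golden() -> None:
    """Golden-style test: pins band edges, so a rubric change must edit this too."""
    default = resolve_policy()
    assert foir_band(0.40, default) == "comfortable"
    assert foir_band(0.41, default) == "stretched"
    assert foir_band(0.56, default) == "overextended"
    # A lender override moves the edge without touching the scoring code.
    strict = resolve_policy({"foir_stretched_max": 0.45})
    assert foir_band(0.50, strict) == "overextended"
```

Rejecting unknown override keys is a design choice in this sketch: a typo in a lender's configuration fails loudly instead of silently falling back to the default.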
Grounding status (used throughout)
- ✅ grounded: backed by a named authority (RBI, NPCI, DPDP Act, a documented lender convention).
- ⚠️ VALIDATE: our current assumption (often a threshold) that needs sign-off from a credit/underwriting expert or a target lender before we treat it as settled. Honesty over false precision.
- 🔲 TODO: not yet defined.
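
One way to make these statuses machine-readable is to attach them to each definition, so the open items can be listed mechanically. This is a minimal sketch under assumptions: the `Grounding` enum, the `Definition` record, the helper `open_validation_items`, and the example entries are all hypothetical and do not describe the existing code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Grounding(Enum):
    GROUNDED = "grounded"  # backed by a named authority
    VALIDATE = "validate"  # current assumption awaiting expert sign-off
    TODO = "todo"          # not yet defined


@dataclass(frozen=True)
class Definition:
    name: str
    grounding: Grounding
    basis: str  # cited source, stated convention, or explicit assumption


def open_validation_items(defs: List[Definition]) -> List[str]:
    """Return the names of definitions still awaiting sign-off, in input order."""
    return [d.name for d in defs if d.grounding is Grounding.VALIDATE]


# Hypothetical entries, for illustration only.
EXAMPLE_DEFINITIONS = [
    Definition("example_channel", Grounding.GROUNDED, "named authority (placeholder)"),
    Definition("example_threshold", Grounding.VALIDATE, "explicit assumption (placeholder)"),
    Definition("example_metric", Grounding.TODO, ""),
]

print(open_validation_items(EXAMPLE_DEFINITIONS))
```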
Map
| File | Covers |
|---|---|
| ontology.md | Entities (Borrower, Case, Statement/Account, Transaction, Counterparty) and the transaction taxonomy: direction, channel, income type, obligation type, risk-signal type. |
| metrics.md | The underwriting metrics and their formulas: income (types, regularity, stability), balances (ADB/AMB/min/negative-days), cash flow, obligations, FOIR / DBR. |
| risk.md | Risk signals (what each means and why it matters to a lender), severity rationale, and the deterministic scoring rubric with its thresholds (operationalized + configurable in src/domain/policy.py). |
| decision.md | Layers 3–4: eligibility (supportable EMI, loan sizing) and the decision policy (approve / conditions / counter-offer / refer / decline), per-product, auditable. |
| references.md | Sources: RBI circulars, NPCI rails, DPDP Act, Account Aggregator (Sahamati), and any lender conventions we anchor to. |
| validation-checklist.md | The open ⚠️ VALIDATE items (thresholds + severities) with their current defaults; this is the agenda for an SME / target-lender sign-off. |
How a domain expert reads this
Start with ontology.md (the vocabulary), then metrics.md (how the numbers are defined), then
risk.md (how they become a verdict). Anything marked ⚠️ VALIDATE is an open question for you.