Ontology — entities and the transaction taxonomy
The vocabulary. Every label the product uses for a thing or an event is defined here, with the code symbol that implements it. Status keys per
README.md: ✅ grounded · ⚠️ VALIDATE · 🔲 TODO.

SME-reviewed 2026-06-18 (rounds 1–3). Thresholds, severities, and the taxonomy are signed off. The high-impact ontology items (non-income detection, credit-card treatment, loan stacking, paycheck-to-paycheck, new obligation/channel types) are implemented; a few refinements are queued (see
validation-checklist.md → Round 3).
1. Entities and relationships
Borrower (a person/business being underwritten)
└── Case (one underwriting pull for that borrower)
    └── Statement ≈ one Account over one Period
        └── Transaction (one debit or credit, on a Date, with a running Balance)
            └── Counterparty (the other side: payer or payee)

| Entity | Definition | Code |
|---|---|---|
| Borrower | The party whose repayment capacity is being assessed. Identity is asserted by the lender and cross-checked against statement account-holder names. | Borrower |
| Case | One underwriting pull: the set of statements consolidated into a single decision view for a borrower. | Case |
| Statement | One bank account’s activity over a continuous period: header (bank, account, holder, period, opening/closing balance) + an ordered ledger of transactions. | StatementMeta |
| Account | A single bank account, identified (masked) by number. A borrower may hold several. | account_number_masked |
| Transaction | One posting: date, narration, debit XOR credit amount, resulting balance, derived channel/category/counterparty/flags. | Transaction |
| Counterparty | The other side of a transaction, derived from the narration (a salary employer, a lender, a merchant, the borrower’s own other account). | Counterparty |
Invariant (the trust gate, ✅): a bank statement is self-verifying — for every row,
previous_balance + credit − debit = balance. The whole chain plus the header endpoints must
reconcile, or the data is not trusted (verify.py). This is the bedrock the rest stands on.
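
The invariant can be sketched in a few lines. This is an illustration only, not the contents of verify.py: the `Row` type and the `chain_reconciles` function are hypothetical names, and amounts are assumed to be `Decimal` so that rounding never produces a false mismatch.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Row:
    """One ledger posting (hypothetical shape, not the real Transaction model)."""
    debit: Decimal
    credit: Decimal
    balance: Decimal


def chain_reconciles(opening: Decimal, rows: list[Row], closing: Decimal) -> bool:
    """Return True only if every row and both header endpoints reconcile."""
    previous = opening
    for row in rows:
        # previous_balance + credit - debit must equal the printed balance
        if previous + row.credit - row.debit != row.balance:
            return False
        previous = row.balance
    # The last running balance must match the header's closing balance
    return previous == closing


rows = [
    Row(Decimal("0"), Decimal("1000.00"), Decimal("1500.00")),
    Row(Decimal("200.50"), Decimal("0"), Decimal("1299.50")),
]
trusted = chain_reconciles(Decimal("500.00"), rows, Decimal("1299.50"))
```

A single row that fails the check is enough to reject the whole statement, which is what makes the check usable as a gate rather than a score.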
2. Transaction direction ✅
| Term | Definition |
|---|---|
| Credit | Money into the account (inflow). Candidate income, transfer-in, refund. |
| Debit | Money out of the account (outflow). Candidate obligation, spend, transfer-out. |
3. Channel / payment rail ✅ (NPCI / RBI rails)
How money moved. Grounded in India’s actual payment rails. Code: TxnMode, taxonomy.CHANNEL_PATTERNS.
| Channel | Definition |
|---|---|
| UPI | Unified Payments Interface — instant retail push/pull (NPCI). |
| IMPS | Immediate Payment Service — instant interbank (NPCI). |
| NEFT | National Electronic Funds Transfer — batch interbank (RBI). |
| RTGS | Real-Time Gross Settlement — high-value real-time (RBI). Split out from NEFT ✅ — it implies a ≥₹2L transfer, so the channel itself is a signal (surfaces in the channel mix). |
| NACH / ECS | National Automated Clearing House / Electronic Clearing Service — mandate-based recurring debits (EMIs, SIPs, insurance). The presence of a NACH return is a core risk signal (§6). |
| SI | Standing-instruction / autopay / e-mandate auto-debit, when not tagged to a specific rail. |
| AEPS | Aadhaar-enabled Payment System — micro-banking / cash via Aadhaar (NPCI). |
| BBPS | Bharat Bill Payment System — interoperable bill payments (NPCI). |
| WALLET | Prepaid instrument / mobile wallet (PPI) — distinct from the UPI rail. |
| ATM | Cash withdrawal at an ATM. |
| CHEQUE | Paper instrument; a cheque return is a risk signal. |
| CASH | Cash deposit/withdrawal at branch/CDM. High cash intensity reduces traceability of income. |
| CARD / POS | Debit/credit card or point-of-sale spend. |
| OTHER | Unclassified. A high OTHER share is a data-quality signal, not a clean default. |
FASTag is intentionally not a channel — useful for profiling, not underwriting (SME C1).
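
As a rough sketch of how narration-based channel detection might look: the patterns below are illustrative assumptions, not the contents of taxonomy.CHANNEL_PATTERNS, and `classify_channel` and `other_share` are hypothetical helper names. Order matters, so the more specific rail (RTGS) is tried before NEFT.

```python
import re

# Hypothetical patterns for illustration; the real table lives in taxonomy.CHANNEL_PATTERNS.
ILLUSTRATIVE_CHANNEL_PATTERNS = [
    ("RTGS", re.compile(r"\bRTGS\b", re.I)),
    ("NEFT", re.compile(r"\bNEFT\b", re.I)),
    ("IMPS", re.compile(r"\bIMPS\b", re.I)),
    ("UPI", re.compile(r"\bUPI\b", re.I)),
    ("NACH", re.compile(r"\b(NACH|ECS)\b", re.I)),
    ("ATM", re.compile(r"\bATM\b", re.I)),
]


def classify_channel(narration: str) -> str:
    """Return the first matching channel, or OTHER when nothing matches."""
    for channel, pattern in ILLUSTRATIVE_CHANNEL_PATTERNS:
        if pattern.search(narration):
            return channel
    return "OTHER"


def other_share(narrations: list[str]) -> float:
    """Fraction of rows left unclassified; a high value is a data-quality signal."""
    if not narrations:
        return 0.0
    unclassified = sum(classify_channel(n) == "OTHER" for n in narrations)
    return unclassified / len(narrations)
```

Reporting the OTHER share alongside the channel mix keeps unclassified rows visible instead of letting them pass as a clean default.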
4. Income types ⚠️ VALIDATE (definitions standard; detection thresholds in metrics.md)
Credits that represent the borrower’s earning capacity, vs. non-income credits that must be
excluded from income. Code: IncomeSource.kind, classified by taxonomy.classify_income_type
(INCOME_TYPE_PATTERNS) over each detected recurring-income group; the salary hint (INCOME_HINT)
plus the recurring-credit heuristic in analytics.py (thresholds in policy.py) decide what is
income. Each source’s class is surfaced in the report. ✅ classes now distinguished.
| Type (kind) | Definition | Underwriting treatment |
|---|---|---|
| salary | Regular employment income (SALARY/SAL/WAGES/STIPEND). | Strongest income — stable, verifiable. |
| business | Recurring trade/professional inflows for self-employed borrowers; the default class for a recurring non-salary inflow that matches no other type. | Income, but assess regularity + seasonality. |
| rental | Recurring rent received (RENT/LEASE on the credit side). | Income; verify with agreement. |
| interest | Bank/FD interest, dividends (interest payout, not principal). | Supplementary income; usually small. |
| government | DBT, pension, subsidy, PFMS, EPFO, scholarship, treasury. | Income; stable but policy-dependent. |
Funds available, but NOT income ✅ (SME 2026-06-18) — taxonomy.NON_INCOME_PATTERNS
The cardinal underwriting error is counting capital movement as earning capacity. These credits are
detected and excluded from income before any income test (classify_non_income_credit), even
when recurring. Income increases earning capacity; these merely move, borrow, redeem, or return capital.
| Category | Examples |
|---|---|
| funding | loan disbursals, overdraft drawdowns, credit-line utilisation, BNPL credit |
| asset_conversion | FD/RD/MF/bond maturity & redemption, insurance proceeds, security-deposit refunds |
| refund | merchant refunds, chargebacks, cashback, failed-transaction reversals |
| reimbursement | travel/fuel/medical/expense reimbursements (balance-sheet neutral) |
| transfer | self-transfers, own-account sweeps, wallet-to-bank, P2P top-ups |
| exceptional | gifts, inheritance, marriage gifts, crowdfunding |
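
The exclusion step could look roughly like the sketch below. The regexes are assumptions made up for illustration (the real ones are in taxonomy.NON_INCOME_PATTERNS), and `non_income_category` and `income_candidates` are hypothetical names; the signature of the real classify_non_income_credit is not shown in this document.

```python
import re
from typing import Optional

# Illustrative patterns only, keyed by the six categories in the table above.
ILLUSTRATIVE_NON_INCOME = {
    "funding": re.compile(r"LOAN DISB|DISBURSAL|OD DRAWDOWN", re.I),
    "asset_conversion": re.compile(r"\bFD (MATURITY|CLOSURE)\b|REDEMPTION", re.I),
    "refund": re.compile(r"REFUND|CASHBACK|REVERSAL|CHARGEBACK", re.I),
    "reimbursement": re.compile(r"REIMB", re.I),
    "transfer": re.compile(r"SELF TRANSFER|OWN A/?C|SWEEP", re.I),
    "exceptional": re.compile(r"GIFT|INHERITANCE", re.I),
}


def non_income_category(narration: str) -> Optional[str]:
    """Return the non-income category for a credit, or None if it may be income."""
    for category, pattern in ILLUSTRATIVE_NON_INCOME.items():
        if pattern.search(narration):
            return category
    return None


def income_candidates(credits: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Drop non-income credits before any recurring-income test runs."""
    return [(n, amt) for n, amt in credits if non_income_category(n) is None]


credits = [
    ("SALARY ACME LTD", 50000.0),
    ("LOAN DISBURSAL BAJAJ", 200000.0),
    ("UPI REFUND AMAZON", 1200.0),
    ("FD MATURITY 1234", 100000.0),
]
candidates = income_candidates(credits)
```

Running the filter first means a recurring loan disbursal or monthly reimbursement never reaches the recurrence heuristic, so it cannot be mistaken for earning capacity.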
Core vs supplementary income ✅ (A1, SME 2026-06-18) — IncomeSource.tier
Affordability (FOIR, income band) is assessed on core income only; supplementary income supports the assessment but must not drive it — converting a one-off bonus into ongoing capacity is a classic error.
| Tier | Includes | Drives FOIR? |
|---|---|---|
| core | salary, pension/government, stable (monthly) business, stable rental | Yes — core_monthly_income is the FOIR denominator. |
| supplementary | bonus, incentive, overtime, commission, arrears; interest; unstable business/rental | No — reported as supplementary_monthly_income. |
One-off bonuses/arrears aren’t annualised — a single non-recurring credit isn’t treated as monthly income at all. Business-income confidence (A4) is approximated by stability: unstable business income → supplementary.
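
A minimal sketch of the tiering rule, assuming a simplified `Source` record with a boolean `stable` flag (how stability is actually measured is defined elsewhere, not here). The names `tier`, `split_income`, and `foir` are hypothetical; only the rule itself comes from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Source:
    """Simplified income source (hypothetical; not the real IncomeSource model)."""
    kind: str              # e.g. "salary", "government", "business", "rental", "interest"
    monthly_amount: float
    stable: bool           # assumed to be computed upstream


ALWAYS_CORE = {"salary", "government"}
CORE_IF_STABLE = {"business", "rental"}


def tier(source: Source) -> str:
    """Apply the core vs supplementary rule from the table above."""
    if source.kind in ALWAYS_CORE:
        return "core"
    if source.kind in CORE_IF_STABLE and source.stable:
        return "core"
    return "supplementary"


def split_income(sources: list[Source]) -> tuple[float, float]:
    """Return (core_monthly_income, supplementary_monthly_income)."""
    core = sum(s.monthly_amount for s in sources if tier(s) == "core")
    supplementary = sum(s.monthly_amount for s in sources if tier(s) == "supplementary")
    return core, supplementary


def foir(monthly_obligations: float, core_monthly_income: float) -> Optional[float]:
    """Obligations over core income; None when there is no core income to divide by."""
    if core_monthly_income <= 0:
        return None
    return monthly_obligations / core_monthly_income


sources = [
    Source("salary", 60000.0, True),
    Source("business", 20000.0, False),   # unstable, so supplementary
    Source("interest", 500.0, True),
]
core, supplementary = split_income(sources)
```

In this example the unstable business income is reported but does not lower FOIR: ₹18,000 of obligations against ₹60,000 of core income gives 0.30, not the 0.22 that counting all ₹80,500 would suggest.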
5. Obligation types ⚠️ VALIDATE
Recurring debits that commit future income: the numerator in FOIR, with core income as the denominator (see metrics.md).
Code: Obligation.type, taxonomy.OBLIGATION_PATTERNS. Now modelled: the type is detected by
the taxonomy; whether it loads FOIR is a policy.py toggle (counts_toward_foir), surfaced on each
obligation as Obligation.counts_toward_foir and shown in the report (“not in FOIR” when excluded).
| Type | Definition | Counts toward FOIR? (default) |
|---|---|---|
| EMI / loan | Equated monthly instalment on an existing loan. | Yes — the core obligation; not a toggle. |
| BNPL · gold loan · LAP · microfinance · payday · salary advance · OD interest | Other forms of credit servicing (SME B4). | Yes — all are real debt; not toggles. |
| Rent | Recurring rent paid. | Yes by default (foir_count_rent) — a real fixed outflow; lenders who treat it as non-credit can toggle it off. |
| Insurance premium | LIC / health / general insurance. | Yes by default (foir_count_insurance) — committed outflow. |
| SIP / investment | Recurring mutual-fund/RD contribution. | No by default (foir_count_sip) — discretionary, can be paused. |
| Utility | Electricity, gas, broadband, telecom. | No by default (foir_count_utility) — variable living cost, not credit. |
| Subscription | OTT, SaaS, memberships. | No by default (foir_count_subscription). |
| Tax | Recurring tax outflow. | No by default (foir_count_tax) — context-dependent. |
| Other | Unclassified recurring debit. | Conditional (foir_count_other) — counts only when it persists ≥ foir_other_min_months (default 3); per SME, don’t auto-count every recurring debit > ₹1,000. |
The “counts toward FOIR” column is the crux of underwriting correctness and is lender policy, not universal fact, so each ambiguous type is an explicit, overridable toggle (tenant.scoring_policy), never buried in a regex. Defaults above are ✅ SME-reviewed (2026-06-18).

Credit-card bill payments are NOT obligations (SME B3, taxonomy.CREDIT_CARD_PAYMENT): the payment amount is spend routing, not debt, and counting it would double-count expenditure. Revolving behaviour (carried balance) is read instead from card finance/late charges → revolving_credit flag (§6).

Detection method: an obligation is a recurring same-payee debit (≥ 2 months, median ≥ ₹1,000).

Fixed vs variable (B1): amount CV ≤ obligation_fixed_max_cv (0.25) is fixed; a named credit obligation may vary up to obligation_variable_max_cv (0.60) and still count, flagged variable (floating EMI); a variable non-credit debit is lumpy spend, not an obligation.

Amortisation (B2): non-monthly obligations (quarterly/annual) load FOIR at their monthly_equivalent (amount ÷ months-between-occurrences), so paying annually doesn’t look stronger.
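
The detection, fixed/variable, and amortisation rules can be sketched together. The constants mirror the defaults quoted above; the function names are hypothetical, and using the population standard deviation for CV is an assumption (the document does not say which estimator analytics.py uses).

```python
from statistics import mean, median, pstdev
from typing import Optional

FIXED_MAX_CV = 0.25       # obligation_fixed_max_cv default
VARIABLE_MAX_CV = 0.60    # obligation_variable_max_cv default
MIN_MONTHS = 2            # recurring across at least two months
MIN_MEDIAN = 1000         # median amount floor, in rupees


def coefficient_of_variation(amounts: list[float]) -> float:
    """Population std-dev over mean (assumed estimator)."""
    avg = mean(amounts)
    return pstdev(amounts) / avg if avg else 0.0


def classify_obligation(amounts: list[float], months_seen: int,
                        is_named_credit: bool) -> Optional[str]:
    """Return 'fixed', 'variable', or None when the debit is not an obligation."""
    if months_seen < MIN_MONTHS or median(amounts) < MIN_MEDIAN:
        return None
    cv = coefficient_of_variation(amounts)
    if cv <= FIXED_MAX_CV:
        return "fixed"
    if is_named_credit and cv <= VARIABLE_MAX_CV:
        return "variable"      # e.g. a floating-rate EMI
    return None                # lumpy non-credit spend


def monthly_equivalent(amount: float, months_between: int) -> float:
    """Spread a non-monthly obligation over the months between occurrences."""
    return amount / months_between
```

For example, an annual ₹12,000 premium loads FOIR at ₹1,000 per month, the same as twelve monthly payments of ₹1,000 would.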
6. Risk-signal types ✅ SME-reviewed 2026-06-18 (definitions grounded; severities in risk.md)
Events that indicate elevated credit risk. Code: RiskFlag.type; patterns in
taxonomy.{BOUNCE,PENAL,CHEQUE_RETURN,CASH_DEPOSIT}. Severities are tuned to limit false
positives: payment dishonours are recency-weighted (≤ 6 months full High, 7–12 partial Medium,
older audit-only); negative-balance days are tiered (1–3 low / 4–10 medium / >10 high); cash-deposit
intensity is borrower-segment dependent (salaried high > 25% of credits, self-employed medium > 40%); circular transfers escalate to high when repeated across ≥ 3 months. Only repeated payment dishonours auto-decline (risk.md §2); other signals accumulate.
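
Two of the tiering rules above translate directly into small lookup functions. This is a sketch under stated assumptions: the function names and the returned labels are hypothetical, and the authoritative severities are the ones in risk.md.

```python
from typing import Optional


def dishonour_severity(months_ago: int) -> str:
    """Recency-weight a payment dishonour: recent events count most."""
    if months_ago <= 6:
        return "high"
    if months_ago <= 12:
        return "medium"
    return "audit_only"     # kept for the audit trail, not scored


def negative_balance_severity(days_overdrawn: int) -> Optional[str]:
    """Tier overdrawn days: 1-3 low, 4-10 medium, more than 10 high."""
    if days_overdrawn <= 0:
        return None
    if days_overdrawn <= 3:
        return "low"
    if days_overdrawn <= 10:
        return "medium"
    return "high"
```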
| Signal | Definition | Why it matters |
|---|---|---|
| NACH / ECS bounce | A mandate-based recurring debit (often an EMI) that failed for insufficient funds. | Direct evidence of a missed/late obligation — the single strongest behavioural red flag. |
| Cheque return | A cheque that bounced. | Same family; payment dishonour. |
| Penal / late charge | A penalty the bank levied (return charge, late fee). | Confirms a dishonour/late event from the bank’s side. |
| Negative-balance days | Days the account was overdrawn. | Liquidity stress. |
| High cash intensity | Large share of income arriving as cash deposits. | Income is hard to verify; AML/round-tripping concern. |
| Circular / round-tripping | Funds cycling A→B→A to inflate apparent turnover. | Manufactured cash flow; fraud signal. |
| Sudden large one-off credit | An outsized inflow inconsistent with the income pattern; escalates to High if ≥ 80% exits within 7 days (pass-through) or the source is untraceable. | May be a loan/borrowing dressed as income, or a pass-through; verify source. |
| Inactive / stale income | No qualifying income credit within the recency window (salaried 90 / self-employed 120 days). | Historical average income may no longer reflect current capacity. |
| Loan stacking | Multiple concurrent loans / lenders, or multiple fresh disbursals in the window (loan_stacking). | Concurrent debt is often more predictive than any single obligation’s size; early debt-cycling signal. |
| Paycheck-to-paycheck | Balance falls below 10% of monthly income within 7 days of income, repeatedly (liquidity_stress). | One of the most predictive liquidity-stress indicators. |
| Revolving credit-card | Card finance / late-payment charges → a carried (revolving) balance (revolving_credit). | Revolving usage, not gross card spend, is the real debt signal (SME B3). |
| Service-failure charges | Mandate/SI failure, minimum-balance, returned-payment fees (service_charges). | Individually weak; repeated occurrences indicate distress. |
| Speculative activity | Gambling/betting/crypto concentration (speculative_activity). | Cash-flow volatility / unstable finances (underwriting view; AML is separate). |
| Joint / co-applicant account | Account holder name indicates joint ownership (joint_account). | Income attribution to the sole borrower is uncertain — don’t assume 100%. |
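
As one worked example from the table, the paycheck-to-paycheck signal might be computed as below. The 10% floor and 7-day window come from the definition above; everything else is an assumption, including the function names, the end-of-day balance map, and treating "repeatedly" as at least two income cycles.

```python
from datetime import date, timedelta


def depleted_after_income(income_date: date, monthly_income: float,
                          daily_balances: dict[date, float],
                          window_days: int = 7, floor_ratio: float = 0.10) -> bool:
    """True if any known end-of-day balance in the window drops below the floor."""
    floor = floor_ratio * monthly_income
    for offset in range(1, window_days + 1):
        balance = daily_balances.get(income_date + timedelta(days=offset))
        if balance is not None and balance < floor:
            return True
    return False


def paycheck_to_paycheck(income_dates: list[date], monthly_income: float,
                         daily_balances: dict[date, float],
                         min_occurrences: int = 2) -> bool:
    """Flag only when depletion repeats (min_occurrences is an assumed threshold)."""
    hits = sum(
        depleted_after_income(d, monthly_income, daily_balances)
        for d in income_dates
    )
    return hits >= min_occurrences


balances = {
    date(2026, 1, 5): 4000.0,    # below the 5,000 floor, 4 days after income
    date(2026, 2, 4): 3000.0,    # below the floor, 3 days after income
    date(2026, 3, 3): 30000.0,   # healthy
}
income_dates = [date(2026, 1, 1), date(2026, 2, 1), date(2026, 3, 1)]
flagged = paycheck_to_paycheck(income_dates, 50000.0, balances)
```

Requiring repetition keeps a single large planned payment, such as annual school fees, from raising the flag on its own.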
7. Where this lives in code today
The taxonomy has been lifted into server/src/domain/ (the plan this section used to track). The
regex/threshold is the ontology entry now; what remains is genuine domain depth, not plumbing.
| Ontology concept | Code (source of truth) | Remaining |
|---|---|---|
| Channels | taxonomy.CHANNEL_PATTERNS | ✅ lifted · RTGS split; AePS/BBPS/SI/wallet added |
| Spend categories | taxonomy.CATEGORY_PATTERNS | ✅ lifted (presentation-only) |
| Obligation types | taxonomy.OBLIGATION_PATTERNS + policy.counts_toward_foir | ✅ lifted · BNPL/gold/LAP/etc. added; FOIR-inclusion configurable; CC excluded |
| Income | taxonomy.{INCOME_HINT,INCOME_TYPE_PATTERNS,NON_INCOME_PATTERNS} + analytics.py heuristic | ✅ classified; non-income credits excluded. ⚠️ core/supplementary split queued (A1) |
| Risk signals | taxonomy.{BOUNCE,PENAL,CHEQUE_RETURN,CASH_DEPOSIT,CARD_FINANCE_CHARGE,SERVICE_CHARGE,LOAN_DISBURSAL} | ✅ lifted · recency-weighted; loan-stacking / paycheck / revolving / service-charge signals added |
| Types/labels | report.py (TxnMode, Obligation.type, IncomeSource.kind, RiskFlag) | labels defined here (this doc) |
The high-impact ontology work is done. What remains is the SME’s lower-priority refinements
(core/supplementary income A1, business-confidence A4, fixed/variable obligations B1, non-monthly
amortisation B2, gambling/crypto D4, joint-account attribution E1 — see validation-checklist Round 3)
and ongoing validation: promote ⚠️ defaults to ✅ grounded with a cited source in
references.md. Lenders can already override any threshold in-app
(Settings → Risk scoring policy) without waiting on that.