Feature List — Bank Statement Analyzer
Product-perspective inventory with build status. Companion to bank-statement-analyzer.md (product brief) and report-spec.md (report contract). For the forward view (what to build next and why), see roadmap.md.

Status: ✅ built (and verified where testable) · 🟡 partial / built-but-unvalidated · 🔲 not started · ❌ decided against

Priority: P0 = needed for the first paying lender · P1 = needed to scale them · P2 = later

1. Acquisition (obsrv.in)
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Landing page (ledger design, lender wedge) | ✅ | P0 | live-ready; founder TODOs below |
| /report — public sample report (proof asset) | ✅ | P0 | server-rendered, shareable |
| Waitlist capture + auto-confirmation email | ✅ | P0 | via existing form infra (Zoho SMTP) |
| Institutional credibility band (no personal brand) | ✅ | P0 | registered LLP, India processing, verifiable — went fully institutional |
| Pricing section (₹5/page) | ✅ | P1 | on the landing page; model settled — credit packs (no per-borrower bundle), see pricing.md |
| Borrower-view / Cases section | ✅ | P1 | “underwrite the borrower” marketing section |
| SEO/AEO | ✅ | P1 | titles, descriptions, sitemap, robots, JSON-LD, llms.txt; OG images 🔲 |
| Content / SEO hub (guides, comparisons) | 🔲 | P2 | for organic acquisition |
2. Ingestion
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Digital PDF upload (≤25MB) | ✅ | P0 | |
| Password-protected PDFs | ✅ | P0 | decrypt before extraction; password threaded upload → worker, blanked at terminal state |
| Scanned PDFs (vision path) | 🟡 | P0 | same document-block path by design; unvalidated on real scans |
| Page chunking (5pp) + parallel extraction | ✅ | P0 | order-preserving; chain re-verified |
| CSV statements | ✅ | P2 | deterministic parser, no LLM (csv_extract.py); 40 rows = 1 billing page |
| Borrower Case: multi-month + multi-account consolidation | ✅ | P1 | inter-account transfer dedup, per-account balance, consolidated FOIR; runs async on the worker. See borrower-case.md |
| Inline upload into a case | ✅ | P1 | POST /v1/cases/{id}/statements/upload — shared submit pipeline, then linked |
| Account Aggregator (consent-based pull) | 🔲 | P2 | the compliant rails, post-PMF |
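
The chunking row above (5-page chunks, parallel extraction, order-preserving) can be sketched as follows. This is a minimal illustration, not the worker's actual code: `chunk_pages`, `extract_in_order`, and the thread-pool choice are assumptions made for the example, and `extract` stands in for whatever per-chunk extraction call the pipeline really makes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")

CHUNK_SIZE = 5  # pages per chunk, per the "5pp" note in the table


def chunk_pages(pages: Sequence[T], size: int = CHUNK_SIZE) -> List[List[T]]:
    """Split pages into consecutive chunks of at most `size` pages."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(pages[i:i + size]) for i in range(0, len(pages), size)]


def extract_in_order(
    chunks: Sequence[List[T]],
    extract: Callable[[List[T]], R],  # hypothetical per-chunk extractor
    max_workers: int = 4,             # assumed pool size
) -> List[R]:
    """Run `extract` on every chunk concurrently and return results in chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even when a later
        # chunk finishes before an earlier one.
        return list(pool.map(extract, chunks))
```

Keeping the results in chunk order matters because the balance chain is checked row to row across chunk boundaries; a 12-page statement splits into chunks of 5, 5, and 2 pages.
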
3. Analysis pipeline (the engine)
| Feature | Status | Priority | Notes |
|---|---|---|---|
| LLM-first transcription (Bedrock, structured outputs) | ✅ | P0 | live-validated (Bedrock proof run); per-chunk model routing (Haiku text / Sonnet vision) |
| Balance-reconciliation trust gate | ✅ | P0 | golden-tested; re-anchoring, endpoints, dupes, date order |
| Income detection (sources, regularity, stability, trend) | ✅ | P0 | each source classified salary/business/rental/interest/government (ontology §4); detection thresholds are policy-configurable |
| Balance analytics (AMB/ADB, min/max, negative days) | ✅ | P0 | daily carry-forward series |
| Monthly cash flow table + totals/ratios | ✅ | P0 | |
| Obligations + FOIR + disposable income | ✅ | P0 | recurring-debit clustering, type classification; which types load FOIR is a per-lender policy toggle (EMI/rent/insurance counted, SIP/subscription/utility/tax excluded by default), shown per-obligation in the report |
| Counterparties (top received-from / sent-to) | ✅ | P0 | |
| Channel mix (UPI/NEFT/RTGS/IMPS/NACH/…) | ✅ | P1 | RTGS split from NEFT — the high-value (≥₹2L) rail is a signal in itself |
| Spend categories | ✅ | P1 | rule-based v1; coverage grows with real narrations |
| Risk flags: NACH/ECS bounce, cheque return, penal charges | ✅ | P0 | |
| Risk flags: negative days, high cash deposits, circular transfers, large one-off credits | ✅ | P1 | |
| Deterministic risk score + band (auditable, reproducible) | ✅ | P0 | scoring.py; golden-tested; never the LLM’s. Thresholds are a documented, per-lender-configurable policy (src/domain/policy.py, PATCH /v1/org/scoring-policy), SME-reviewed 2026-06-18: income-band-dependent FOIR, tiered flag severities, segment-aware cash thresholds, irregular income penalised (not knocked out), bands Low≥80/Med≥60. See docs/domain/ |
| Domain knowledge base + ontology | ✅ | P1 | docs/domain/ defines, src/domain/ derives: taxonomy.py (channel/category/obligation/income/risk patterns) + policy.py (scoring rubric, income thresholds, FOIR inclusion). The pipeline imports both — the regex/threshold is the ontology entry, not an ad-hoc constant |
| Eligibility + decision policy (Layers 3–4) | ✅ | P1 | src/domain/decision.py + POST /v1/statements/{id}/decision: supportable EMI & loan sizing, and the lender’s approve/conditions/counter-offer/refer/decline policy (per-product, auditable, SME-reviewed, round-5 sign-off: tightened microfinance, external KYC/fraud hard-stop, adverse-action reason codes). The engine recommends, a human decides (see override workflow below). In-app decision panel on /analyze; per-tenant policy editor in Settings. See docs/domain/decision.md |
| Override & approvals workflow (delegated authority) | ✅ | P1 | src/domain/authority.py + decision queue: refers/declines route to an approvals queue; per-user authority level (L1/L2/L3, owner=L3) gates who can override what; claim → resolve/escalate with reason codes and an append-only audit trail; KYC/fraud/tamper/verification hard-blocks are never overridable. See override-workflow.md |
| Backtest harness (calibrate the engine vs real decisions) | 🟡 | P1 | src/domain/backtest.py + server/scripts/backtest.py built and tested — manifest of decided cases → live pipeline + decision → SME’s four metrics (decision agreement, risk ranking, flag predictiveness, affordability accuracy). Awaiting lenders’ real decided-case data to run. See docs/domain/validation-checklist.md |
| Verdict narrative (LLM narrates the computed score) | ✅ | P0 | prose may vary; score does not |
| Document integrity: math/dupes/sequence checks | ✅ | P0 | |
| Tamper forensics (PDF metadata, fonts, layout) | 🔲 | P1 | premium trust signal |
| Confidence scoring + needs_review gating | ✅ | P0 | never-silently-pass enforced in worker |
| Ask-the-statement Q&A | 🔲 | P2 | wow feature, post-MVP |
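
To make the FOIR inclusion toggle and the score bands in the table above concrete, here is a small sketch. It assumes FOIR is the counted monthly obligations divided by monthly income, uses the default inclusion listed in the Obligations row, and maps scores with the Low≥80 / Med≥60 cut-offs. The names (`Obligation`, `foir`, `risk_band`), the treatment of unknown obligation types as excluded, and the label "High" for scores below 60 are assumptions for illustration; the real rubric lives in src/domain/policy.py and scoring.py.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Iterable

# Defaults from the table: EMI/rent/insurance load FOIR,
# SIP/subscription/utility/tax do not.
DEFAULT_FOIR_INCLUSION: Dict[str, bool] = {
    "emi": True, "rent": True, "insurance": True,
    "sip": False, "subscription": False, "utility": False, "tax": False,
}


@dataclass(frozen=True)
class Obligation:
    kind: str                # hypothetical type label, e.g. "emi"
    monthly_amount: Decimal  # recurring monthly debit in rupees


def foir(
    obligations: Iterable[Obligation],
    monthly_income: Decimal,
    inclusion: Dict[str, bool] = DEFAULT_FOIR_INCLUSION,
) -> Decimal:
    """Counted obligations divided by income, as a fraction (0.4 means 40%)."""
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    counted = sum(
        (o.monthly_amount for o in obligations if inclusion.get(o.kind, False)),
        Decimal("0"),  # unknown kinds are excluded here (assumption)
    )
    return counted / monthly_income


def risk_band(score: int) -> str:
    """Map a deterministic score to a band using the Low>=80 / Med>=60 cut-offs."""
    if score >= 80:
        return "Low"
    if score >= 60:
        return "Medium"
    return "High"  # assumed label for everything below 60
```

With an EMI of ₹12,000, rent of ₹8,000, a SIP of ₹5,000, and income of ₹50,000, the default policy counts ₹20,000 and gives a FOIR of 0.4; a lender who toggles SIP on would see 0.5 for the same borrower.
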
4. Report & outputs
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Canonical AnalysisReport contract (Pydantic ⇄ TS) | ✅ | P0 | schema export + camelCase parity verified |
| Report web UI (ledger design, full sections) | ✅ | P0 | |
| JSON via API | ✅ | P0 | |
| PDF export (credit-committee format) | ✅ | P0 | render_pdf.py; GET /v1/statements/{id}/result.pdf |
| Consolidated case PDF | 🔲 | P1 | case report is web + JSON today |
| Excel export | 🔲 | P1 | |
5. Web application (app.obsrv.in)
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Upload → live stage progress → report | ✅ | P0 | live at app.obsrv.in (routed platform shell) |
| Auth (email/OTP) + organizations + seats | ✅ | P0 | OTP sessions, org invites (+ absorb empty solo org on invite) |
| Statements/Documents view (history, size, storage usage, download/delete original) | ✅ | P0 | per-doc size + storage-usage header; download retained originals; refetches on every visit and polls while any upload is queued/processing, so in-progress statements show live (no waiting for one to finish before uploading the next) |
| Borrower Cases workspace (list, build, consolidate, upload, delete) | ✅ | P1 | /cases; multi-account, async with polling, inline upload, soft delete. See borrower-case.md |
| In-context guidance (section descriptions + info tooltips) | ✅ | P1 | every section explains its purpose |
| Settings: org name · retention · API key · risk scoring + decision policy | ✅ | P1 | Settings → Risk scoring policy (full ScoringPolicy) and Decision policy (per-product eligibility + approve/refer/decline rules), owner-only editors with reset — the domain configurability, self-serve |
| In-app decision panel (run a decision on a report) | ✅ | P1 | /analyze → after a report, enter amount/tenure/product → eligibility + approve/conditions/counter-offer/refer/decline with reasons; optional “send to approvals queue” |
| Approvals queue + Team authority control | ✅ | P1 | /queue (claim → resolve/escalate, hard-block caveat, append-only event log) and /team per-user authority level (L1/L2/L3); see override-workflow.md |
| Approvals CSV export | ✅ | P1 | GET /v1/queue/export.csv — the ask, borrower financials, engine call, and resolution (level/reason/note) for the lender’s audit file / LOS import; download button on /queue |
| Paginated row lists | ✅ | P1 | queue, statements, cases, and team lists paginate server-side ({items, total, limit, offset}, page size 20, shared Pager); no silent truncation, actionable queue items ordered first |
| needs_review resolve UI (flagged rows vs page image) | 🔲 | P1 | API resolve endpoint exists ✅ |
| Credit balance display + low-credit nudges | 🟡 | P1 | balance shown in top bar; nudges 🔲 |
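
The pagination envelope named in the Paginated row lists row can be illustrated with a short sketch. It works on an in-memory list purely for demonstration; a server would presumably apply the same limit and offset in the database query. The `paginate` and `actionable_first` helpers and the `actionable` field are hypothetical names, not the app's API.

```python
from typing import Any, Dict, List, Sequence

PAGE_SIZE = 20  # page size stated in the table


def actionable_first(rows: Sequence[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Put actionable rows first; sorted() is stable, so order within each group is kept."""
    return sorted(rows, key=lambda r: not r.get("actionable", False))


def paginate(
    rows: Sequence[Dict[str, Any]],
    limit: int = PAGE_SIZE,
    offset: int = 0,
) -> Dict[str, Any]:
    """Return one page in the {items, total, limit, offset} envelope."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    return {
        "items": list(rows[offset:offset + limit]),
        "total": len(rows),  # lets the client show the full count, so nothing is silently dropped
        "limit": limit,
        "offset": offset,
    }
```

Returning `total` alongside the page is what lets a shared pager render page counts without a second request.
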
6. API platform (integrations)
| Feature | Status | Priority | Notes |
|---|---|---|---|
| REST: submit / status / result / review-resolve | ✅ | P0 | FastAPI; auto OpenAPI docs |
| REST: borrowers / cases (create, list, add/upload, consolidate, report, delete) | ✅ | P1 | statements/cases.py; GET /v1/statements/usage for storage |
| REST: decision (eligibility + approve/refer/decline) | ✅ | P1 | POST /v1/statements/{id}/decision — per-product policy, deterministic, audited |
| REST: approvals queue + authority | ✅ | P1 | GET /v1/queue, /{id}, claim/resolve/escalate (authority-gated, reason-coded, append-only events); PATCH /v1/org/users/{email}/authority |
| Idempotency keys | ✅ | P0 | |
| Tenant API keys + auth | ✅ | P0 | hashed keys + OTP sessions; org roles |
| Rate limiting | ✅ | P1 | 30/hr per tenant |
| Webhooks (job.terminal · case.terminal) | ✅ | P1 | HMAC-signed, best-effort, both worker + serverless |
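
For integrators, a receiver for the HMAC-signed webhooks above might verify signatures along these lines. This is a sketch under stated assumptions: the table only says the webhooks are HMAC-signed, so the SHA-256 digest, the hex encoding, and the function names here are illustrative choices, and the payload shape is invented for the example.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC of the raw request body (SHA-256 is an assumption)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the signature over the raw body and compare in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

Verification should run on the raw bytes received, before any JSON parsing or re-serialization. Because delivery is best-effort, a receiver would also do well to treat job.terminal and case.terminal events as possibly missing or repeated, and fall back to polling the status endpoint.
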
7. Billing
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Pay-per-statement credits + free trial credits | ✅ | P0 | ₹5/page · CSV 40 rows/page · partial+resume · 1 trial credit |
| Credit packs + lot-based 12-mo expiry | ✅ | P0 | accounts/packs.py (₹500/₹1,000/₹2,000, +0/5/10% bonus); credit_lots (FIFO, soonest-expiry first, hourly sweep); non-refundable. See pricing.md |
| Self-serve top-ups (plug-and-play gateway) | 🟡 | P0 | infra built provider-agnostic — src/core/payments/ factory + MockGateway runs checkout→webhook→grant→receipt end-to-end; /v1/credits/{packs,checkout,webhook}; buy UI on /credits; receipt email. Pending: a real gateway’s keys (Cashfree/Razorpay/Dodo = one class + register) |
| Purchase receipt email | ✅ | P1 | receipt.html on the payment-succeeded webhook (idempotent); a send failure never voids a paid purchase |
| Per-statement cost instrumentation | ✅ | P1 | llm_usage per call (page range, attempt, frozen ₹ cost); case-consolidation spend too; scripts/token_usage.py |
| Storage accounting (per-doc size + per-tenant totals) | ✅ | P1 | Job.size_bytes; GET /v1/statements/usage (lifetime + retained) — basis for a future size/count cap |
| Per-borrower bundle pricing | ❌ | P1 | decided against (2026-06-23) — pure-consumption credit packs instead; see pricing.md |
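
The billing rules above (₹5 per page, 40 CSV rows per billing page, lots consumed soonest-expiry first) can be worked through in a short sketch. Several details are assumptions rather than documented behaviour: that a partly filled CSV page rounds up, that a lot is unusable from its expiry date onward, and that a shortfall raises an error (the real flow supports partial+resume, which this does not model). `CreditLot` and `consume` are hypothetical names, not the schema of credit_lots.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import List

PRICE_PER_PAGE_INR = 5   # ₹5/page, from the table
CSV_ROWS_PER_PAGE = 40   # 40 CSV rows = 1 billing page, from the table


def csv_billing_pages(row_count: int) -> int:
    """Billing pages for a CSV; a partial page rounds up (assumption)."""
    return math.ceil(row_count / CSV_ROWS_PER_PAGE)


@dataclass
class CreditLot:
    remaining: int    # credits left in this lot (unit is illustrative)
    expires_on: date  # lot expiry date


def consume(lots: List[CreditLot], amount: int, today: date) -> None:
    """Deduct `amount` from unexpired lots, soonest expiry first."""
    live = sorted(
        (lot for lot in lots if lot.expires_on > today and lot.remaining > 0),
        key=lambda lot: lot.expires_on,
    )
    if sum(lot.remaining for lot in live) < amount:
        raise ValueError("insufficient credits")  # no partial+resume in this sketch
    for lot in live:
        take = min(lot.remaining, amount)
        lot.remaining -= take
        amount -= take
        if amount == 0:
            break
```

Under these assumptions a 100-row CSV bills as 3 pages, or ₹15. Draining the soonest-expiring lot first means a tenant loses as few credits as possible to the 12-month expiry.
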
8. Trust, security & compliance
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Raw statement auto-delete on success | ✅ | P0 | delete-on-analysis default; retain-mode vault opt-in |
| Soft delete for cases & borrowers | ✅ | P1 | deleted_at; hidden everywhere, record retained, statement links freed |
| TTL sweep for failed/stalled jobs | ✅ | P0 | hourly tick (also refreshes FX, rescues stale consolidations) |
| Encryption (S3 SSE, TLS) | ✅ | P0 | code-side done; bucket policy = infra |
| Append-only audit log | ✅ | P1 | basic events wired |
| India data residency (storage ap-south-1) | ✅ | P0 | inference may transit regions — disclosure item |
| DPDP consent flow + privacy disclosures | 🔲 | P0 | legal text before real applicant data |
| VPC hardening (private DB, endpoints) | 🔲 | P0* | *prerequisite before real PII in cloud (DEPLOYMENT.md) |
| Report retention/expiry controls | 🟡 | P1 | expires_at field exists; enforcement 🔲 |
9. Platform & operations
| Feature | Status | Priority | Notes |
|---|---|---|---|
| Local dev stack (compose) + fake LLM backend | ✅ | — | zero-cost dev |
| CI (lint + tests + site build) | ✅ | — | activates on first push |
| Serverless entrypoints (Mangum API, SQS worker) | ✅ | P0 | verified under Lambda RIE |
| Alembic-owned schema | ✅ | P0 | |
| AWS deploy (CDK + OIDC workflow) | ✅ | P0 | live: Lambda (API + worker) · SQS · CloudFront api.obsrv.in · Supabase |
| Monitoring/alarms (DLQ depth, error rate, confidence drift) | 🔲 | P1 | |
Scoreboard
- Shipped and live: the analysis engine + trust gate, deterministic scoring, PDF export, Bedrock extraction (live-validated), AWS deploy (Lambda/SQS/CloudFront/Supabase), auth + orgs + invites, the statements/documents view, and the Borrower Case (multi-account consolidation with inter-account dedup, async on the worker, soft delete, storage accounting, webhooks).
- Remaining to scale revenue: wire a real payment gateway (purchase infra is built — packs, lots, factory, receipt; needs a gateway’s keys), DPDP consent text + VPC hardening before real PII at volume, and tamper detection (the next trust feature).
- The shape now: the product is built and live; the pricing model is settled (credit packs) and the purchase flow runs end-to-end on the mock gateway — what’s left is a real gateway’s keys, compliance text, and the next trust feature.