ERPNext is a 20-year-old open-source ERP built on the Frappe framework. Understanding its architecture explains why legacy modernization is hard — and why structured specifications like ModernizeSpec are necessary.
Founder: Rushabh Mehta — self-taught FOSS developer whose family business suffered a failed ERP implementation. That experience led to ERPNext: an ERP that small businesses could actually use.
Company: Frappe Technologies Pvt. Ltd., incorporated July 2008 in Mumbai, India.
Vision: Become the “WordPress of ERP” — making enterprise resource planning accessible to millions of small businesses worldwide. 100% open source under GPL-3.0 since 2009, with no paywall for “enterprise” features.
| Metric | Value |
|---|---|
| GitHub stars | ~31,400 |
| Forks | ~10,400 |
| Contributors | 903+ (repo), 2,267+ (ecosystem) |
| Total commits | 55,600+ |
| Revenue (FY2025) | ~$3.9M USD |
| Revenue CAGR | 48% |
| Dimension | Count |
|---|---|
| Python files | 2,532 |
| Python lines of code | 316,679 |
| JavaScript files | 626 |
| JavaScript lines of code | 73,932 |
| Python function definitions | 11,392 |
| Whitelisted API endpoints | 768 |
| Unique doctypes | 521 |
| Test files | 362 |
| Patch/migration files | 405 |
| Modules | 21 |
Every business entity in ERPNext is a DocType — simultaneously defining data model, UI layout, API endpoints, and behavior.
Standard DocTypes
Regular documents like Sales Invoice, Customer, Supplier. Full CRUD with lifecycle hooks.
Child Table DocTypes
Embedded tables like Invoice Items, Address lines. Nested within parent documents.
Single DocTypes
Singleton settings like Company Settings, Global Defaults. One record per site.
Submittable DocTypes
Draft/Submit/Cancel workflow like Journal Entry, Sales Invoice. State machine with accounting implications.
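The Draft/Submit/Cancel lifecycle can be sketched as a small state machine. Frappe stores this state in a document's `docstatus` field (0 = Draft, 1 = Submitted, 2 = Cancelled). The `DocStatus` enum, the `ALLOWED` table, and the `transition` helper below are illustrative names rather than Frappe APIs, and the rules are simplified.

```python
from enum import IntEnum


class DocStatus(IntEnum):
    """Mirrors the values stored in a document's docstatus field."""
    DRAFT = 0
    SUBMITTED = 1
    CANCELLED = 2


# Hypothetical transition table: a draft can be submitted, a submitted
# document can be cancelled, and a cancelled document is terminal.
ALLOWED = {
    DocStatus.DRAFT: {DocStatus.SUBMITTED},
    DocStatus.SUBMITTED: {DocStatus.CANCELLED},
    DocStatus.CANCELLED: set(),
}


def transition(current: DocStatus, target: DocStatus) -> DocStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target


print(transition(DocStatus.DRAFT, DocStatus.SUBMITTED).name)
```

The accounting implications come from what runs on each transition (posting or reversing ledger entries), which this sketch deliberately leaves out.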
DocType definitions are JSON files in the codebase. CRUD operations, form layouts, list views, validations, and REST APIs are auto-generated from metadata. With 521 doctypes, ERPNext exposes 521+ resource API endpoints automatically.
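To make the "endpoints from metadata" idea concrete, here is a minimal sketch that derives the resource routes for one DocType from its name alone. The `/api/resource/<DocType>` URL pattern is Frappe's REST convention; the `resource_endpoints` helper and the trimmed `sales_invoice` dictionary are hypothetical and only illustrate the shape of the mapping.

```python
from urllib.parse import quote


def resource_endpoints(doctype_name: str) -> list[tuple[str, str]]:
    """List the (HTTP method, path) pairs of the resource API for one DocType."""
    base = f"/api/resource/{quote(doctype_name)}"
    item = f"{base}/{{name}}"  # {name} is a placeholder for the document name
    return [
        ("GET", base),     # list documents
        ("POST", base),    # create a document
        ("GET", item),     # read one document
        ("PUT", item),     # update one document
        ("DELETE", item),  # delete one document
    ]


# A trimmed, hypothetical DocType definition; real JSON files carry many more keys.
sales_invoice = {"name": "Sales Invoice", "module": "Accounts", "is_submittable": 1}

for method, path in resource_endpoints(sales_invoice["name"]):
    print(method, path)
```

Nothing in the loop is specific to Sales Invoice, which is the point: adding a DocType definition adds routes without anyone writing handler code.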
    +----------------------------------------------------------+
    |                   ERPNext Application                    |
    |  (30+ modules: Accounting, HR, Manufacturing, CRM, ...)  |
    +----------------------------------------------------------+
    |                     Frappe Framework                     |
    |   (Full-stack: ORM, REST API, UI gen, background jobs)   |
    +----------------------------------------------------------+
    | Python 3.14 | MariaDB/Postgres | Redis x3   | Node.js 24 |
    | Gunicorn    | Nginx            | RQ Workers | Socket.IO  |
    +----------------------------------------------------------+

The controller inheritance chain is the single most important architectural constraint for migration. The full chain is deeper than it first appears:
    frappe.model.document.Document
      +-- StatusUpdater (status workflow transitions)
           +-- TransactionBase (posting date, UOM, naming validation)
                +-- SubcontractingController (vendor subcontracting ops)
                |    +-- BuyingController (1,271 lines)
                |    |    +-- PurchaseOrder
                |    |    +-- PurchaseInvoice
                |    |    +-- PurchaseReceipt
                |    +-- SellingController (1,075 lines)
                |         +-- SalesOrder
                |         +-- SalesInvoice
                |         +-- DeliveryNote
                |
                +-- AccountsController (4,412 lines, 168 functions)
                |    +-- PaymentEntry (3,559 lines)
                |    +-- JournalEntry
                |
                +-- StockController (2,380 lines, 142 functions)
                     +-- StockEntry (4,149 lines)
                     +-- StockReconciliation

There are 5-6 levels of inheritance before you reach a concrete doctype. AccountsController is the single most complex file in the codebase: every purchase order, sales invoice, payment entry, and stock entry flows through it. A single method change has a blast radius across the entire application.
The shared controllers/ directory contains 23,212 lines across 18 Python files. These form the inheritance chain that all transaction doctypes depend on.
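The depth and blast radius of such a chain can be measured mechanically. The sketch below uses empty stand-in classes that borrow names from the chain above (assuming the nesting Document, StatusUpdater, TransactionBase, SubcontractingController, BuyingController, PurchaseOrder); they are not the real ERPNext classes, and `inheritance_chain` and `affected_by` are hypothetical helpers.

```python
# Empty stand-ins that only reproduce the shape of one branch of the chain.
class Document: ...
class StatusUpdater(Document): ...
class TransactionBase(StatusUpdater): ...
class SubcontractingController(TransactionBase): ...
class BuyingController(SubcontractingController): ...
class PurchaseOrder(BuyingController): ...


def inheritance_chain(cls: type) -> list[str]:
    """Class names from the concrete doctype up to the root, excluding object."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


def affected_by(base: type) -> set[str]:
    """Names of every direct or indirect subclass of base."""
    found: set[str] = set()
    for sub in base.__subclasses__():
        found.add(sub.__name__)
        found |= affected_by(sub)
    return found


print(len(inheritance_chain(PurchaseOrder)))  # levels in this branch
print(sorted(affected_by(TransactionBase)))   # classes a change could reach
```

Run against the real codebase, the same two functions would give a first approximation of which doctypes inherit a changed method, though they would not see behaviour attached through hooks.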
From a DDD perspective, this inheritance chain conflates two concerns: domain logic and shared infrastructure. AccountsController is simultaneously a domain service (it knows about taxes, GL entries, and pricing rules) and an infrastructure base class (every transaction inherits from it). In a modernized architecture, these decompose into 4-5 independent services with explicit interfaces.
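One way to picture "independent services with explicit interfaces" is composition over `typing.Protocol` interfaces instead of a shared base class. Every name below (`PricingService`, `TaxService`, `LedgerService`, `InvoiceSubmitter`, and the toy implementations) is hypothetical, and the arithmetic is deliberately trivial; it is a sketch of the shape, not of ERPNext's or ModernizeSpec's actual design.

```python
from dataclasses import dataclass, field
from typing import Protocol


class PricingService(Protocol):
    def apply(self, amount: float) -> float: ...


class TaxService(Protocol):
    def tax_for(self, amount: float) -> float: ...


class LedgerService(Protocol):
    def post(self, debit: float, credit: float) -> None: ...


@dataclass
class FlatDiscount:
    rate: float

    def apply(self, amount: float) -> float:
        return amount * (1 - self.rate)


@dataclass
class FlatTax:
    rate: float

    def tax_for(self, amount: float) -> float:
        return amount * self.rate


@dataclass
class InMemoryLedger:
    entries: list[tuple[float, float]] = field(default_factory=list)

    def post(self, debit: float, credit: float) -> None:
        self.entries.append((debit, credit))


@dataclass
class InvoiceSubmitter:
    """Depends on three interfaces it is handed, not on a base class."""
    pricing: PricingService
    taxes: TaxService
    ledger: LedgerService

    def submit(self, amount: float) -> float:
        net = self.pricing.apply(amount)
        total = net + self.taxes.tax_for(net)
        self.ledger.post(debit=total, credit=total)
        return total


ledger = InMemoryLedger()
submitter = InvoiceSubmitter(FlatDiscount(0.5), FlatTax(0.25), ledger)
print(submitter.submit(200.0), ledger.entries)
```

Each collaborator can be replaced or tested alone, which is exactly what a 4,412-line base class does not allow.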
Every financial transaction follows a hidden multi-step pipeline, assembled at runtime through inheritance and hooks:
    +-------------------------------------------------------------------+
    | 1. VALIDATION                                                     |
    |    validate() hooks on Document → TransactionBase → Controller    |
    +-------------------------------------------------------------------+
                                      |
                                      v
    +-------------------------------------------------------------------+
    | 2. PRICING RULES                                                  |
    |    apply_pricing_rule_on_transaction()                            |
    |    (dynamic discounts, rate overrides, margin calculations)       |
    +-------------------------------------------------------------------+
                                      |
                                      v
    +-------------------------------------------------------------------+
    | 3. TAXES & TOTALS                                                 |
    |    taxes_and_totals.py: 5 charge types, cascading, per-item       |
    |    (2,800 lines of calculation logic)                             |
    +-------------------------------------------------------------------+
                                      |
                                      v
    +-------------------------------------------------------------------+
    | 4. ACCOUNTING                                                     |
    |    Auto-generate GL Entries via general_ledger.py                 |
    |    (debit/credit pairs, multi-currency, budget checks)            |
    +-------------------------------------------------------------------+
                                      |
                                      v
    +-------------------------------------------------------------------+
    | 5. STOCK                                                          |
    |    Update Stock Ledger via stock_ledger.py (2,439 lines)          |
    |    (valuation, serial/batch tracking, warehouse transfers)        |
    +-------------------------------------------------------------------+
                                      |
                                      v
    +-------------------------------------------------------------------+
    | 6. STATUS WORKFLOW                                                |
    |    StatusUpdater → Draft/Submitted/Cancelled state machine        |
    +-------------------------------------------------------------------+

None of these steps are visible from reading a single doctype file. A SalesInvoice.on_submit() triggers this entire chain through method inheritance and hook dispatch. This is why structured context (ModernizeSpec’s extraction-plan.json) matters: AI agents need to know the full pipeline, not just the file they’re reading.
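For contrast, the same six stages can be written as an explicit, ordered list, which is roughly what a modernized design would expose. The stage names follow the pipeline described above; `Transaction`, `make_step`, `PIPELINE`, and `submit` are hypothetical, and each step only records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Transaction:
    """Hypothetical stand-in for a document being submitted."""
    doctype: str
    log: list[str] = field(default_factory=list)


def make_step(name: str) -> Callable[[Transaction], None]:
    """Build a placeholder step that records its own name when it runs."""
    def step(txn: Transaction) -> None:
        txn.log.append(name)
    step.__name__ = name
    return step


# The order is data, visible in one place, rather than a side effect of
# method resolution order and hook registration.
PIPELINE = [
    make_step(name)
    for name in (
        "validation",
        "pricing_rules",
        "taxes_and_totals",
        "accounting",
        "stock",
        "status_workflow",
    )
]


def submit(txn: Transaction) -> Transaction:
    for step in PIPELINE:
        step(txn)
    return txn


print(submit(Transaction("Sales Invoice")).log)
```

An agent reading this version can list the stages by inspecting `PIPELINE`; in the inherited version it has to reconstruct them from several files.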
Every module that processes transactions depends on Accounts (for GL entries) and often Stock (for inventory). The actual coupling is:
                 +----------+
                 | Accounts | <---- Every transaction module
                 +----+-----+
                      ^
                      |
            +---------+---------+
            |         |         |
        +---+---+ +---+---+ +---+---+
        | Buying| |Selling| | Stock |
        +---+---+ +---+---+ +---+---+
            |         |         |
            +----+----+         |
                 |              |
         +-------+-------+ +----+-----------+
         | Manufacturing | | Subcontracting |
         +---------------+ +----------------+

    setup     <---- all modules (Company, Currency, Fiscal Year)
    utilities <---- all modules (naming, validation, regional)

Accounts is upstream of everything. This is why ModernizeSpec’s extraction plan starts with Core Accounting (Phase 1): you cannot migrate downstream modules until the GL entry interface is stable.
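The "upstream first" rule is a topological ordering of the module graph. The sketch below uses Python's standard `graphlib` module; the edges in `DEPENDS_ON` are one approximate reading of the dependencies described above (an assumption, not an extracted import graph), and the variable names are hypothetical.

```python
from graphlib import TopologicalSorter

# module -> modules it depends on (assumed edges, for illustration only)
DEPENDS_ON = {
    "Accounts": {"Setup", "Utilities"},
    "Buying": {"Accounts"},
    "Selling": {"Accounts"},
    "Stock": {"Accounts"},
    "Manufacturing": {"Buying", "Selling"},
    "Subcontracting": {"Stock"},
}

# static_order() yields every module after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

Whatever the exact edges turn out to be, Accounts lands ahead of every transaction module, which matches starting the extraction plan with Core Accounting. `TopologicalSorter` also raises `CycleError` if the graph has a cycle, which is a useful early warning when the edges come from real imports.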
hooks.py (686 lines) is the central registry that wires the entire application, including the handlers bound to document events (validate, on_submit, on_cancel).

This makes execution paths implicit rather than explicit. When a Sales Invoice is submitted, hooks trigger stock updates, accounting entries, notifications, and regional compliance checks across multiple files, none of which are visible from reading sales_invoice.py alone.
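A toy version of hook dispatch shows why the call chain is hard to see. The `doc_events` dictionary is shaped like the one in a Frappe `hooks.py` (DocType name, then event name, then handlers, with `"*"` matching every DocType), but it holds callables instead of dotted-path strings so the sketch stays self-contained. The handler functions and `run_hooks` are hypothetical, and the dispatch order shown is an assumption.

```python
def update_stock(doc_name: str) -> str:
    return f"stock updated for {doc_name}"


def post_gl_entries(doc_name: str) -> str:
    return f"GL entries posted for {doc_name}"


def send_notification(doc_name: str) -> str:
    return f"notification sent for {doc_name}"


# Registered in one central place, far from the doctype's own file.
doc_events = {
    "Sales Invoice": {"on_submit": [update_stock, post_gl_entries]},
    "*": {"on_submit": [send_notification]},
}


def run_hooks(doctype: str, event: str, doc_name: str) -> list[str]:
    """Call the handlers for this doctype, then the wildcard handlers."""
    handlers = (
        doc_events.get(doctype, {}).get(event, [])
        + doc_events.get("*", {}).get(event, [])
    )
    return [handler(doc_name) for handler in handlers]


for line in run_hooks("Sales Invoice", "on_submit", "SINV-0001"):
    print(line)
```

Nothing in a `SalesInvoice` class would mention any of these three functions, so a reader of that file alone cannot know they run.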
| Module | Python Files | Doctype JSONs | Relative Weight |
|---|---|---|---|
| Accounts | 677 | 292 | 43% of all doctypes |
| Stock | 335 | 85 | 13% |
| Manufacturing | 180 | 51 | 8% |
| Setup | 128 | 55 | 8% |
| Selling | 115 | 24 | 4% |
| CRM | 97 | 27 | 4% |
| Buying | 93 | 23 | 3% |
| Assets | 79 | 28 | 4% |
| Projects | 62 | 17 | 3% |
Accounts dominates: 43% of all doctype definitions and 27% of all Python files. Any migration effort must address Accounts first or risk cascading issues across the entire system.
| File | Lines | Type |
|---|---|---|
| test_purchase_receipt.py | 5,284 | Test |
| test_sales_invoice.py | 5,068 | Test |
| accounts_controller.py | 4,412 | Controller |
| test_work_order.py | 4,216 | Test |
| stock_entry.py | 4,149 | DocType |
| test_tax_withholding_category.py | 4,021 | Test |
| payment_entry.py | 3,559 | DocType |
| serial_and_batch_bundle.py | 3,285 | DocType |
| test_purchase_invoice.py | 3,234 | Test |
| sales_invoice.py | 3,167 | DocType |
The largest test files cluster around critical doctypes, confirming that the most business-critical entities are also the most complex.
| Tier | Modules | Doctypes | Effort Multiplier | Reason |
|---|---|---|---|---|
| Tier 1 (Core) | Accounts, Controllers | ~292 | 3x | Deep inheritance, implicit paths, regional overrides |
| Tier 2 (Transaction) | Stock, Selling, Buying | ~132 | 2x | Depends on Tier 1 controllers, complex business logic |
| Tier 3 (Domain) | Manufacturing, CRM, Projects, Assets | ~123 | 1.5x | Domain-specific but less interconnected |
| Tier 4 (Utility) | Support, Quality, Setup, Maintenance | ~90 | 1x | Relatively self-contained |
| Tier 5 (Industry) | Education, Healthcare, Agriculture | Varies | 1x | Could be deferred entirely |
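One way to read the tier table is to weight each tier's doctype count by its effort multiplier. That weighting is an assumption about how the multipliers are meant to be used, not something the table states, and Tier 5 is left out because its doctype count is listed as "Varies". The names `TIERS`, `weighted`, and `total` are hypothetical.

```python
# (tier, approximate doctype count, effort multiplier) from the table above
TIERS = [
    ("Tier 1 (Core)", 292, 3.0),
    ("Tier 2 (Transaction)", 132, 2.0),
    ("Tier 3 (Domain)", 123, 1.5),
    ("Tier 4 (Utility)", 90, 1.0),
]

# Weighted effort in "doctype-equivalents": count times multiplier.
weighted = {name: count * mult for name, count, mult in TIERS}
total = sum(weighted.values())

for name, units in weighted.items():
    print(f"{name}: {units:g} units ({units / total:.0%} of total)")
```

Under this reading Tier 1 accounts for well over half of the weighted effort even though it holds under half of the counted doctypes, which is consistent with the advice to address Accounts first.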
ERPNext was not designed with Domain-Driven Design, but DDD concepts map onto its structure — some naturally, some as anti-patterns:
| DDD Concept | ERPNext Equivalent | Quality |
|---|---|---|
| Bounded Contexts | 21 modules (Accounts, Stock, Selling…) | Good boundaries, but cross-module calls bypass them |
| Aggregates | Document + child tables (Invoice + Items) | Natural fit — Frappe enforces parent-child integrity |
| Aggregate Roots | DocTypes with lifecycle (Submit/Cancel) | Present but implicit — no explicit root enforcement |
| Value Objects | Currency, UOM, Address components | Missing — all data is mutable Document fields |
| Domain Events | hooks.py document events | Anti-pattern: events are implicit, registered globally |
| Repositories | frappe.get_doc(), frappe.get_list() | Implicit — ORM is tightly coupled, not injectable |
| Domain Services | Controller inheritance chain | Anti-pattern: services are base classes, not composable |
| Anti-Corruption Layer | allow_regional() decorator | Partial — regional overrides are the only ACL pattern |
| Context Map | Module modules.txt + imports | Missing — no explicit upstream/downstream contracts |
The good news: ERPNext’s module structure provides natural bounded context boundaries. Accounts, Stock, Selling, and Manufacturing are clearly delineated in the filesystem.
The challenge: Domain logic is trapped inside an inheritance hierarchy rather than expressed as composable services. Extracting the Tax Calculator into Go required identifying that the “service” was actually methods scattered across AccountsController, taxes_and_totals.py, and regional override hooks — not a single class.
This is the pattern ModernizeSpec’s domains.json captures: mapping legacy code organization to DDD bounded contexts, including coupling scores that reveal how entangled the modules really are.
| Factor | Impact |
|---|---|
| Frappe framework coupling | Every doctype depends on Frappe ORM, permissions, naming, workflow |
| Controller inheritance | 23,212 lines of shared logic; all transactions flow through AccountsController |
| Implicit execution paths | hooks.py makes call chains invisible across files |
| Factor | Impact |
|---|---|
| Regional overrides | Country-specific tax and compliance scattered across hooks |
| Auto-generated APIs | 700+ endpoints emerge from metadata, not explicit code |
| Multi-tenancy | Bench/site model embedded in framework, not application code |
| Factor | Impact |
|---|---|
| 462 patches | Decade of schema evolution must be understood for data model correctness |
| Dynamic typing | Python’s lack of static types makes automated analysis harder |
| Test coverage gaps | 7:1 source-to-test ratio; many paths untested |
| JavaScript frontend | 73,932 lines using jQuery + Frappe UI |
| Factor | Benefit |
|---|---|
| DocType JSON schemas | Machine-readable data model definitions — ideal for automated extraction |
| Module organization | 21 clear modules with explicit boundaries in modules.txt |
| Lifecycle hooks | Predictable method names (validate, on_submit, on_cancel) |
| Open source | Full access to every line of code and commit history |
| Large test suite | 362 test files provide behavioral specifications |
| Well-documented API | @frappe.whitelist annotations mark all public endpoints |
ERPNext is a representative example of the legacy modernization challenge. Every concept in its complexity profile maps to a ModernizeSpec specification file:
| ERPNext Concept | ModernizeSpec File |
|---|---|
| 521 doctypes across 21 modules | domains.json — bounded context inventory |
| AccountsController God-class | complexity.json — hotspot identification |
| Controller inheritance chain | complexity.json — coupling scores |
| Tier 1-5 classification | extraction-plan.json — phase sequencing |
| 68 parity tests | parity-tests.json — behavior preservation |
| Migration progress tracking | migration-state.json — progress dashboard |
The specification was extracted from this exact analysis. Every schema field exists because ERPNext’s migration needed it.