Internals: How It Works
This chapter describes the design and implementation of NextORM for contributors and developers who want to understand the engine.
Entity metaclass (EntityMeta)
When Python processes a class statement like:
    class Product(Entity):
        name: Req[str]
        price: Req[float]
        tags: Set["Tag"]
the metaclass EntityMeta intercepts type.__new__
and walks the class annotations. For each annotation whose
__origin__ is one of the NextORM field markers (PK, Req, Opt,
Local, Set, Single) it:
1. Creates a descriptor (FieldDescriptor, SetDescriptor, SingleDescriptor, or LocalDescriptor) and replaces the annotation placeholder on the class.
2. Records metadata in the class-level dicts _fields_, _relations_, and _locals_.
3. Registers the entity in the global _entity_registry so Database can discover it without explicit register() calls.
Type checking vs. runtime
The PK, Req, Opt, Set, and Single markers are real Python
generic classes with custom __get__ / __set__. Type checkers (pyright,
mypy) see them as typed descriptors — PK[int].__get__ returns ColumnExpr
at the class level and int at the instance level. At runtime, however,
EntityMeta.__new__ replaces each annotated attribute with a real
descriptor object, so the __get__ / __set__ on the marker classes are
never called.
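The shape a type checker sees can be illustrated with an overloaded descriptor. This is a sketch of the general typing technique only: ColumnExpr here is a placeholder class, and the marker is assigned as a class attribute so the example runs on its own, whereas NextORM declares fields through annotations and swaps in its own descriptors.

```python
from typing import Generic, TypeVar, overload

T = TypeVar("T")


class ColumnExpr:
    """Placeholder for a column reference used when building queries."""

    def __init__(self, name: str) -> None:
        self.name = name


class PK(Generic[T]):
    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    # The two overloads tell a type checker what each access style returns.
    @overload
    def __get__(self, instance: None, owner: type) -> ColumnExpr: ...
    @overload
    def __get__(self, instance: object, owner: type) -> T: ...

    def __get__(self, instance, owner):
        if instance is None:
            return ColumnExpr(self.name)     # class-level access
        return instance.__dict__[self.name]  # instance-level access

    def __set__(self, instance: object, value: T) -> None:
        instance.__dict__[self.name] = value


class User:
    # Assigned directly for this sketch; NextORM uses annotations instead.
    id = PK[int]()
```

With this definition, a checker infers ColumnExpr for User.id and int for User().id.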
Identity map and session cache
SessionCache is a lightweight in-process object store:
- _objects: dict[(EntityClass, pk) → entity_instance]
- _to_save: ordered list of new (unsaved) entity instances
- _dirty: set of modified entity instances
When an entity attribute is written via FieldDescriptor.__set__, the
descriptor calls cache.mark_dirty(instance) on the current session. On
session exit (or explicit flush()) all dirty and
pending-save entities are written through save().
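A minimal sketch of such a cache follows. The three attribute names and mark_dirty come from the description above; add_loaded, add_new, and the save callback passed to flush are hypothetical helpers introduced only for this illustration.

```python
class SessionCache:
    def __init__(self):
        self._objects = {}   # (EntityClass, pk) -> entity instance
        self._to_save = []   # new instances, in creation order
        self._dirty = set()  # loaded instances with modified attributes

    def add_loaded(self, instance, pk):
        # Identity map: the first instance stored for a key wins, so the
        # same row always maps to the same Python object.
        return self._objects.setdefault((type(instance), pk), instance)

    def add_new(self, instance):
        self._to_save.append(instance)

    def mark_dirty(self, instance):
        self._dirty.add(instance)  # a set, so repeated writes are harmless

    def flush(self, save):
        # New instances are written first, then modified ones; an instance
        # that is both new and dirty is written only once.
        pending = list(self._to_save)
        pending += [obj for obj in self._dirty if obj not in self._to_save]
        for obj in pending:
            save(obj)
        self._to_save.clear()
        self._dirty.clear()
```

Keeping _dirty as a set means a descriptor can call mark_dirty on every attribute write without tracking whether the instance was already marked.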
The session stack is stored in a threading.local variable for sync code
and a ContextVar for async code. The
DBSessionManager reads whichever stack is active.
Database and provider abstraction
Database owns a single persistent connection
(or a ConnectionPool) and a reference to a
SyncProvider. The provider handles all SQL
dialect variations:
Database ─── SyncProvider ─── sqlite3 / psycopg / PyMySQL
└──── DDLRenderer (SQLiteRenderer / PostgresRenderer / MariaDBRenderer)
AsyncDatabase is structurally identical but
obtains its provider from the async provider registry and uses
aiosqlite / psycopg[async] / asyncmy.
Schema layer
The schema layer (nextorm.schema) produces Table
objects from entity metadata. build_schema(entities) walks each entity’s
_fields_ and _relations_ and produces the full table+column+FK graph.
DDLRenderer serialises this graph to SQL CREATE TABLE statements.
diff_schemas compares two serialised schema dicts and emits ALTER TABLE
statements — this is what makemigrations uses to auto-generate migration files.
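The render and diff steps can be sketched as follows. The Column and Table dataclasses, the render_create helper, and the dict layout passed to diff_schemas are assumptions made for this example; the sketch detects added columns only, while a real diff would also handle dropped or altered columns and new tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    nullable: bool = False


@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple


def render_create(table):
    """Serialise one table to a CREATE TABLE statement."""
    cols = ", ".join(
        f"{c.name} {c.sql_type}" + ("" if c.nullable else " NOT NULL")
        for c in table.columns
    )
    return f"CREATE TABLE {table.name} ({cols})"


def diff_schemas(old, new):
    """Compare {table: {column: sql_type}} dicts; emit ALTERs for new columns."""
    statements = []
    for table, columns in new.items():
        if table not in old:
            continue  # a brand-new table needs CREATE TABLE, not ALTER
        for column, sql_type in columns.items():
            if column not in old[table]:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}"
                )
    return statements
```

Comparing serialised dicts rather than live objects is what lets a migration tool diff the current entities against a snapshot stored with the previous migration.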
Query compilation
A QuerySet builds an AST of immutable SQL node objects
from nextorm.sql.nodes:
QuerySet
├── _entity_class → provides column names
├── _where → BinOp (AST tree)
├── _order → tuple[OrderItem]
├── _joins → tuple[(join_type, table, alias, BinOp)]
└── _lim / _off → int | None
When a terminal method (fetch_all(), count(), …) is called, the
SQLBuilder renders the AST to a parameterised
SQL string and hands it to Database._execute().
Each QuerySet method returns a shallow clone so the original is never
mutated — the builder pattern is fully immutable.
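The clone-on-every-call behaviour can be sketched with frozen dataclasses. This is an illustration under simplifying assumptions: conditions are kept as a flat tuple joined with AND instead of a nested BinOp tree, the field names differ from the real private attributes, and render stands in for the SQLBuilder using qmark placeholders.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class BinOp:
    left: str   # column name
    op: str     # SQL operator such as ">" or "="
    right: Any  # value bound as a parameter, never inlined into the SQL


@dataclass(frozen=True)
class QuerySet:
    table: str
    where: tuple = ()
    lim: Optional[int] = None

    def filter(self, cond):
        # replace() builds a new QuerySet; self is left untouched.
        return replace(self, where=self.where + (cond,))

    def limit(self, n):
        return replace(self, lim=n)


def render(qs):
    """Render a QuerySet to (sql, params) with qmark placeholders."""
    sql, params = f"SELECT * FROM {qs.table}", []
    if qs.where:
        sql += " WHERE " + " AND ".join(f"{c.left} {c.op} ?" for c in qs.where)
        params.extend(c.right for c in qs.where)
    if qs.lim is not None:
        sql += " LIMIT ?"
        params.append(qs.lim)
    return sql, params
```

Because every method returns a new object, a base query can be shared and refined in several places without one caller's filters leaking into another's.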
Generator-expression decompiler
The select() function accepts a generator expression
and decompiles its bytecode back to a filter predicate. It uses
dis to walk the bytecode instructions and map comparison and
binary operations (COMPARE_OP, BINARY_OP) to BinOp
AST nodes.
Because bytecode layouts differ between Python versions, the decompiler
normalises opcode names per-version. Complex Python expressions (function
calls, multi-level attribute chains) raise
DecompileError; use the filter-based API for those
cases.
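A heavily reduced sketch of the idea is shown below. It recognises only a filter of the form item.attr OP constant and is not the real decompiler; decompile_filter and the tuple it returns are hypothetical. It reads the operator from the instruction's argval and accepts either LOAD_CONST or LOAD_SMALL_INT for the constant, which is one example of the per-version differences mentioned above.

```python
import dis


class DecompileError(Exception):
    """Raised when the generator's filter is too complex for this sketch."""


# Opcodes that push a constant; LOAD_SMALL_INT appears in newer CPython.
_CONST_LOADS = {"LOAD_CONST", "LOAD_SMALL_INT"}


def decompile_filter(genexpr):
    """Return (attribute, operator, constant) for a simple filtered generator.

    Supports only `(x for x in source if x.attr OP constant)`.
    """
    instructions = list(dis.get_instructions(genexpr.gi_code))
    for index, instr in enumerate(instructions):
        if instr.opname != "COMPARE_OP":
            continue
        if index < 2:
            break
        # The two operands are pushed immediately before the comparison.
        attr, const = instructions[index - 2], instructions[index - 1]
        if attr.opname == "LOAD_ATTR" and const.opname in _CONST_LOADS:
            return attr.argval, instr.argval, const.argval
        break
    raise DecompileError("unsupported generator expression")
```

The generator is never iterated; only its code object is inspected. A filter such as len(p.name) > 3 places a call instruction where the sketch expects an attribute load, so it raises DecompileError.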
Flush and commit pipeline
The session-exit path implements a PonyORM-style staged commit:
Session.__exit__
│
├── [clean exit]
│   ├── _collect_dbs()                 ← scans _objects + _to_save + _dirty
│   ├── for db in dbs: db.flush()      ← write all pending changes
│   │   └── on failure → rollback all, re-raise
│   ├── primary._commit_transaction()
│   │   └── on failure → rollback secondaries, re-raise
│   └── for db in secondaries: db._commit_transaction()
│       └── each failure appended; first one re-raised at end
│
└── [exception]
    └── for db in dbs: db._rollback_transaction()   (errors swallowed)
_collect_dbs discovers database instances by inspecting the _db_
instance variable of every entity in the session cache (identity map, pending
save queue, and dirty set).
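The clean-exit branch can be sketched as a plain function. The method names flush, _commit_transaction, and _rollback_transaction follow the diagram; commit_session, the FakeDb test double, and the choice of the first database as primary are assumptions made for this illustration.

```python
def _rollback_all(dbs):
    for db in dbs:
        try:
            db._rollback_transaction()
        except Exception:
            pass  # rollback errors are swallowed


def commit_session(dbs):
    """Staged commit; dbs[0] is treated as the primary (an assumption)."""
    if not dbs:
        return
    try:
        for db in dbs:
            db.flush()                 # stage 1: write all pending changes
    except Exception:
        _rollback_all(dbs)
        raise
    primary, secondaries = dbs[0], dbs[1:]
    try:
        primary._commit_transaction()  # stage 2: commit the primary
    except Exception:
        _rollback_all(secondaries)
        raise
    errors = []
    for db in secondaries:             # stage 3: commit every secondary
        try:
            db._commit_transaction()
        except Exception as exc:
            errors.append(exc)
    if errors:
        raise errors[0]                # first failure re-raised at the end


class FakeDb:
    """Test double that logs each step and can fail on a chosen one."""

    def __init__(self, name, fail_on=None):
        self.name, self.fail_on, self.log = name, fail_on, []

    def _step(self, step):
        self.log.append(step)
        if step == self.fail_on:
            raise RuntimeError(f"{self.name}: {step} failed")

    def flush(self):
        self._step("flush")

    def _commit_transaction(self):
        self._step("commit")

    def _rollback_transaction(self):
        self._step("rollback")
```

Flushing every database before committing any of them moves most failures (constraint violations, optimistic check errors) ahead of the first commit, where a full rollback is still possible.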
Optimistic concurrency
When optimistic=True (the default), NextORM tracks which columns were
read since the entity was loaded (stored in instance.__dict__["_read_cols_"]).
On UPDATE, it appends AND col = original_value to the WHERE clause for each read column.
If the row was changed by another transaction in the interim, the update
matches zero rows and OptimisticCheckError is raised.
This implements per-field optimistic concurrency checking, matching PonyORM’s semantics.
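The check can be reproduced with the standard sqlite3 module. This is a sketch of the technique rather than NextORM's code: optimistic_update and its arguments are hypothetical, the primary key column is assumed to be named id, and table and column names are interpolated directly, so they must come from trusted metadata.

```python
import sqlite3


class OptimisticCheckError(Exception):
    pass


def optimistic_update(conn, table, pk, changes, read_values):
    """changes: {col: new_value}; read_values: {col: value seen at load time}.

    Note: a read value of None would need `IS NULL` instead of `= ?`;
    this sketch does not handle that case.
    """
    set_sql = ", ".join(f"{col} = ?" for col in changes)
    checks = "".join(f" AND {col} = ?" for col in read_values)
    sql = f"UPDATE {table} SET {set_sql} WHERE id = ?{checks}"
    params = [*changes.values(), pk, *read_values.values()]
    cursor = conn.execute(sql, params)
    if cursor.rowcount == 0:
        # No row matched: it was modified (or deleted) since it was read.
        raise OptimisticCheckError(f"{table} row {pk} changed concurrently")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 'Widget', 9.5)")
```

The first update that passes the original price succeeds. A second update that still believes the price is 9.5 matches zero rows and raises, without needing any row lock between the read and the write.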
Single-table inheritance
STI entities share one database table. EntityMeta looks for
_discriminator_col_ on the parent class and _discriminator_val_ on each
subclass.
- SELECT: a WHERE kind = 'dog' clause is automatically appended when querying a subclass, and the correct Python class is chosen when loading rows from the parent class based on the discriminator value.
- INSERT: the discriminator column is included in the INSERT with the subclass’s configured value.
- Schema: the parent class owns the table; subclass columns are merged in.
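The SELECT behaviour can be sketched with plain classes. The two class attributes follow the names above; Animal, Dog, Cat, and the helper functions are hypothetical, rows are plain dicts, and the discriminator value is bound as a parameter rather than written into the SQL.

```python
class Animal:
    _discriminator_col_ = "kind"  # declared once, on the parent class


class Dog(Animal):
    _discriminator_val_ = "dog"


class Cat(Animal):
    _discriminator_val_ = "cat"


def subclasses_by_value(root):
    """Map each discriminator value to its subclass, at any depth."""
    mapping, pending = {}, list(root.__subclasses__())
    while pending:
        cls = pending.pop()
        mapping[cls._discriminator_val_] = cls
        pending.extend(cls.__subclasses__())
    return mapping


def class_for_row(root, row):
    # Unknown values fall back to the parent class in this sketch.
    value = row[root._discriminator_col_]
    return subclasses_by_value(root).get(value, root)


def select_sql(cls, table):
    sql = f"SELECT * FROM {table}"
    if "_discriminator_val_" in vars(cls):  # a subclass: add the filter
        return sql + f" WHERE {cls._discriminator_col_} = ?", [cls._discriminator_val_]
    return sql, []                          # the parent: no filter
```

Querying Dog restricts the rows to kind = 'dog', while querying Animal returns every row and picks the Python class per row from the discriminator value.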
Provider system
Providers are registered in a global registry keyed by name. You can register custom providers:
from nextorm.providers import register_provider
from mylib.provider import MySyncProvider
register_provider("mydb", MySyncProvider)
A SyncProvider must implement:
    class SyncProvider:
        def connect(self, *args, **kwargs) -> SyncConnection: ...
        def last_insert_id(self, cursor: SyncCursor) -> int | None: ...
        def sql_builder_class(self) -> type[SQLBuilder]: ...
        def ddl_renderer_class(self) -> type[DDLRenderer]: ...
        def paramstyle(self) -> str: ...  # "qmark" | "format" | "pyformat"
Connection lifecycle
Without pooling, Database maintains one persistent
connection opened at first use (via _ensure_connection). When pooling is
enabled, each _execute / _execute_dml cycle checks out a connection,
runs the SQL, and immediately returns it.
The same model applies to AsyncDatabase using
AsyncConnectionPool.
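The check-out, run, return cycle in pooled mode can be sketched with a queue and a context manager. This ConnectionPool is a toy written for the example, not NextORM's class: its constructor arguments, checkout, idle_count, and the execute helper are all assumptions.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def checkout(self):
        conn = self._idle.get()   # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # returned even if the SQL raised

    def idle_count(self):
        return self._idle.qsize()


def execute(pool, sql, params=()):
    # One statement per checkout: the connection goes back to the pool
    # as soon as the rows have been fetched.
    with pool.checkout() as conn:
        return conn.execute(sql, params).fetchall()
```

Returning the connection in a finally block ensures a failed statement cannot leak a connection and gradually drain the pool.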