Master-part relationships express the idea that a given entity (master) may include several tightly-coupled component entities (parts) spread across multiple tables. This notion is also described as compositional integrity.
In DataJoint’s relational workflow philosophy, a master-part relationship expresses the notion of populating multiple related data artifacts in a single workflow step.
For example, a purchase order may include several items that should be treated as indivisible components of the purchase order. Another example is a measurement from several channels: all must be recorded jointly before any downstream processing can begin.
When inserting or deleting a master entity with all its parts, the database client must do so as a single all-or-nothing (atomic) transaction so that the master entity always appears with all its parts. Creating the master with any of its parts missing would constitute a violation of compositional integrity.
Defining a Master-Part Relationship¶
Consider the example of a schema describing polygons, each defined by the coordinates of its vertices.
Here Polygon/Vertex is a master-part relationship.
%xmode minimal
import datajoint as dj

schema = dj.Schema('polygons')

@schema
class Polygon(dj.Manual):
    definition = """
    polygon_id : int
    """

    class Vertex(dj.Part):
        definition = """
        -> master
        vertex_id : int
        ---
        x : float
        y : float
        """

# Explicit numeric datatypes such as float32 and uint16 are introduced in DataJoint 2.0.
# Earlier versions of datajoint-python use native MySQL datatypes such as INT and SMALLINT.

dj.Diagram(schema)

As seen in this example, DataJoint provides special syntax for defining master-part relationships:
Master tables are declared normally – The master entity is declared as any regular table by subclassing dj.Manual, dj.Lookup, dj.Imported, or dj.Computed. A table becomes a master table by virtue of having part tables.
Nested class definition – Parts are declared as nested classes inside their master class, subclassing dj.Part. Part tables are therefore referred to by their full class name, such as Polygon.Vertex. Part classes do not need to be wrapped with the @schema decorator: the decorator of the master class is responsible for declaring all of its parts.
Foreign key from part to master – Each part table declares a foreign key to its master, either directly or transitively through other parts. Inside the namespace of the master class, a special object named master can be used to reference the master table. Thus the definition of the Vertex table can declare the foreign key -> master as an equivalent alias to -> Polygon; either forms a valid foreign key.
Part tables can introduce new schema dimensions – Unlike auto-populated master tables, which cannot introduce new dimensions (see Primary Keys), part tables can define new primary key attributes. In the example above, Vertex introduces the vertex_id dimension to identify individual vertices within each polygon. This is the mechanism by which computations can produce multiple output entities from a single input.
Diagram notation – In schema diagrams, part tables are rendered without colored blocks. They appear as labels attached to the master node, emphasizing that they do not stand on their own. Part table names that introduce new dimensions are underlined, following the standard convention for dimension-defining tables.
Workflow semantics – For computed and imported tables, the master's make() method is responsible for inserting both the master row and all its parts within a single ACID transaction. This ensures compositional integrity is maintained automatically.
Master-Part Semantics¶
The Master-Part relationship indicates to all client applications that inserts into the master and its parts must be done inside a dedicated transaction.
Structural rules:
Transactions cannot be nested and neither can master-part relationships. A part table cannot be a master table in another relationship.
Parts can only have one master. However, a master table can have multiple part tables, as sketched after these rules.
All parts must declare a foreign key to the master, although they can do so transitively through other parts.
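To illustrate, here is a minimal sketch with hypothetical Recording, Channel, and Annotation tables: one master with two part tables, each declaring its foreign key to the master directly (either part could instead reference the master transitively through the other part).

@schema
class Recording(dj.Manual):
    definition = """
    recording_id : int
    """

    class Channel(dj.Part):
        definition = """
        -> master
        channel_id : int
        ---
        gain : float
        """

    class Annotation(dj.Part):
        definition = """
        -> master
        annotation_id : int
        ---
        note : varchar(255)
        """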
Dependency semantics:
A foreign key made by a downstream table to the master signifies a dependency on the entire collection of all its parts. This is a crucial property: when a table depends on a master, it implicitly depends on all the master’s parts as well. The downstream table can safely assume that whenever the master entry exists, all its associated parts are present and complete.
Deletion behavior:
Deleting a master entry naturally cascades to all its parts due to the foreign key constraint.
Parts cannot be deleted without deleting their master; direct deletes of parts are prohibited (illustrated in the sketch below).
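Continuing the Polygon example, a sketch of both behaviors (the exact error raised for a direct part delete may vary by DataJoint version):

# Deleting the master cascades to its Vertex parts.
(Polygon & 'polygon_id = 1').delete()

# Deleting directly from a part table is refused.
(Polygon.Vertex & 'polygon_id = 2').delete()   # expected to raise an error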
For manual and lookup tables:
At insert time, DataJoint does not enforce the master-part semantics for manual and lookup tables. The master-part notation only signals to the client applications that they must use transactions when inserting records into masters and their parts.
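For example, a client inserting a Polygon with its vertices can wrap the inserts in an explicit transaction. This is a sketch assuming the connection's transaction context manager (dj.conn().transaction); the exact API may differ across DataJoint versions.

# Insert a polygon together with all its vertices as one atomic operation:
# the transaction commits only if every insert succeeds, otherwise it rolls back.
with dj.conn().transaction:
    Polygon.insert1(dict(polygon_id=1))
    Polygon.Vertex.insert([
        dict(polygon_id=1, vertex_id=0, x=0.0, y=0.0),
        dict(polygon_id=1, vertex_id=1, x=1.0, y=0.0),
        dict(polygon_id=1, vertex_id=2, x=0.5, y=1.0),
    ])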
Master-Part in Computed Tables¶
Master-part relationships are most powerful in auto-populated tables (dj.Computed or dj.Imported).
The master is responsible for populating all its parts within a single make call.
Schema Dimensions in Computed Tables¶
Auto-populated tables have a fundamental constraint: they cannot introduce new schema dimensions directly. Their primary key must be fully determined by foreign keys to their upstream dependencies. This ensures that the key source (the set of entities to be computed) is well-defined.
However, computations often produce multiple output entities from a single input—detecting multiple cells in an image, extracting multiple spikes from a recording, or identifying multiple vertices in a polygon. Part tables solve this by being allowed to introduce new dimensions.
In the blob detection example below, Detection (the master) inherits its primary key entirely from Image and BlobParamSet. It cannot add new dimensions. But Detection.Blob (the part) introduces the blob_id dimension to identify individual blobs within each detection.
ACID Transactions¶
When populate is called, DataJoint executes each make() method inside an ACID transaction:
Atomicity – The entire make call is all-or-nothing. Either the master row and all its parts are inserted together, or none of them are. If any error occurs—whether in computing results, inserting the master, or inserting any part—the entire transaction is rolled back. No partial results are ever committed to the database.
Consistency – The transaction moves the database from one valid state to another. The master-part relationship ensures that every master entry has its complete set of parts. Referential integrity constraints are satisfied at commit time.
Isolation – The transaction operates on a consistent snapshot of the database. Other concurrent transactions cannot see the partially inserted data until the transaction commits. This means other processes querying the database will never observe a master without its parts.
Durability – Once the transaction commits successfully, the data is permanently stored. Even if the system crashes immediately after, the master and all its parts will be present when the database restarts.
The Master’s Responsibility¶
The master’s make method is responsible for:
Fetching all necessary input data
Performing all computations
Inserting the master row
Inserting all part rows
This design ensures that the entire computation for one entity is self-contained within a single transactional boundary.
Example: Blob Detection¶
Consider the Blob Detection pipeline where Detection (master) and Detection.Blob (part) work together:
from skimage.feature import blob_doh  # blob detection via Determinant of Hessian (assumes scikit-image)

@schema
class Detection(dj.Computed):
    definition = """
    -> Image
    -> BlobParamSet
    ---
    nblobs : int
    """

    class Blob(dj.Part):
        definition = """
        -> master
        blob_id : int  # NEW DIMENSION: identifies blobs within detection
        ---
        x : float
        y : float
        r : float
        """

    def make(self, key):
        # fetch inputs
        img = (Image & key).fetch1("image")
        params = (BlobParamSet & key).fetch1()

        # compute results
        blobs = blob_doh(
            img,
            min_sigma=params['min_sigma'],
            max_sigma=params['max_sigma'],
            threshold=params['threshold'])

        # insert master and parts (within one transaction)
        self.insert1(dict(key, nblobs=len(blobs)))
        self.Blob.insert(
            # blob_doh returns one (row, col, sigma) triple per blob
            dict(key, blob_id=i, x=x, y=y, r=r)
            for i, (y, x, r) in enumerate(blobs))

In this example:
The make method is called once per (image_id, blob_paramset) combination
Each call runs inside its own ACID transaction
Detection cannot introduce new dimensions—its primary key is fully inherited
Detection.Blob introduces the blob_id dimension to identify each detected blob
The master row stores the aggregate blob count; the part rows store individual coordinates
If blob_doh raises an exception or any insert fails, nothing is committed
An image with 200 detected blobs results in 1 master row + 200 part rows, all inserted atomically
This transactional guarantee means that any downstream table depending on Detection can trust that all Detection.Blob rows for that detection are present.
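A minimal usage sketch (assuming the upstream Image and BlobParamSet tables are populated; the key attribute names below are hypothetical): populate() invokes make() once for each missing input combination, and the stored aggregate can be checked against the part rows.

# Run the computation; each make() call executes inside its own transaction.
Detection.populate(display_progress=True)

# Verify compositional integrity for one detection (hypothetical key values).
key = dict(image_id=1, blob_paramset=1)
assert (Detection & key).fetch1('nblobs') == len(Detection.Blob & key)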
Dependency on Master Implies Dependency on Parts¶
A key property of master-part relationships is that a dependency on the master is also a dependency on all its parts.
When a downstream table declares a foreign key to a master table, it can safely assume that all the master’s parts are present and complete. This is guaranteed by the transactional semantics: the master and its parts are always inserted together atomically.
Example: SelectDetection¶
In the Blob Detection example, SelectDetection allows users to mark the optimal parameter set for each image:
@schema
class SelectDetection(dj.Manual):
    definition = """
    -> Image
    ---
    -> Detection
    """

The foreign key -> Detection establishes a dependency on the Detection master table.
Although SelectDetection does not explicitly reference Detection.Blob, it implicitly depends on all blobs for the referenced detection.
This has important implications:
Data availability – When querying SelectDetection, you can join with Detection.Blob knowing that all blob coordinates for the selected detection exist (see also the fetch sketch after this list):

# Get all blobs for selected detections
Detection.Blob & SelectDetection

Cascading deletes – If a Detection entry is deleted, its Detection.Blob parts are automatically deleted (due to the part's foreign key to master). The SelectDetection entry referencing that detection is also deleted (due to SelectDetection's foreign key to Detection). The entire dependency chain is maintained.
Workflow integrity – Downstream computed tables can depend on the master and freely access both master attributes and part details. The workflow guarantees that if the master exists, all its computational results (stored in parts) are complete.
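Expanding on the data-availability point, a sketch of fetching the per-blob coordinates for all selected detections:

# Restrict the part table to detections chosen in SelectDetection,
# then fetch the per-blob coordinates and radii as arrays.
x, y, r = (Detection.Blob & SelectDetection).fetch('x', 'y', 'r')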
Why This Matters¶
This design pattern enables clean separation of concerns:
The master row stores aggregate or summary information (e.g., total blob count)
The part rows store detailed, per-item information (e.g., coordinates of each blob)
Downstream tables reference only the master, keeping their definitions simple
Queries can access part details through joins when needed
The transactional guarantee ensures this separation never leads to inconsistent states where a master exists without its parts.
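For example, a query sketch that pairs each blob with its detection's summary count, using the tables defined above:

# Join the master with its part and restrict on a master attribute;
# every part row carries the master's nblobs value alongside its own x, y, r.
dense_blobs = (Detection * Detection.Blob) & 'nblobs > 100'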
Practical Guidelines¶
When to use master-part:
Tightly-coupled detail rows – If the part data never exists without the master, implement it as a nested part rather than a separate table. Examples include waveform channels, spike units per recording, order lines for a purchase order, detected features in an image, or parameter sweeps attached to a model fit.
Aggregate + detail pattern – When computations produce both summary statistics (master) and per-item details (parts), master-part is the natural choice.
Atomic multi-row results – When a single computation produces multiple rows that must appear together or not at all.
Implementation best practices:
Keep all logic inside make() – Populate the master and insert all parts from within the master's make() method. Do not create separate processes that attempt to fill the part tables independently—the transactional guarantees rely on this pattern.
Insert master before parts – While both are within the same transaction, inserting the master first ensures the foreign key constraint from parts to master is satisfied.
Downstream tables reference the master – Declare foreign keys to the master table, not to individual parts. This keeps definitions clean and leverages the implicit dependency on all parts.
Diagram awareness:
Part nodes appear without colored blocks in the diagram, rendered as labels attached to the master node.
Use this visual cue to distinguish between independent entity tables and parts.
Diagramming utilities provide the option of hiding all parts for a simplified view.
Summary¶
Master-part relationships provide a structured way to model entities that own subordinate detail rows. Key principles:
Compositional integrity – A master and its parts form an indivisible unit. They are inserted and deleted together, never partially.
ACID transactions – Each make() call runs inside a transaction guaranteeing atomicity, consistency, isolation, and durability. If any step fails, the entire operation is rolled back.
Master's responsibility – The master's make() method is solely responsible for populating itself and all its parts. This keeps the transactional boundary clear and self-contained.
Schema dimensions – Auto-populated master tables cannot introduce new dimensions; their primary key is fully inherited through foreign keys. Part tables can introduce new dimensions, enabling computations to produce multiple output entities from a single input.
Implicit part dependency – A foreign key to the master implies a dependency on all its parts. Downstream tables can safely assume that when the master exists, all its parts are present and complete.
Clean separation – Masters hold aggregate/summary data while parts hold per-item details. Downstream tables reference the master; queries join with parts when details are needed.
DataJoint’s nested class syntax and transactional populate mechanism make this pattern easy to express and safe to use in relational workflows.