Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Relationships

Relationships: Modeling Entity Connections

Relational databases model the real world by representing entities and the relationships between them. While the Foreign Keys chapter covered how foreign keys enforce referential integrity, this chapter focuses on the patterns that emerge from different combinations of uniqueness and referential constraints.

Understanding these patterns is essential for designing schemas that accurately represent business rules, data dependencies, and workflow structures.

Relationship Fundamentals

Relationships between tables are established through referential constraints (foreign keys) combined with uniqueness constraints (primary keys and unique indexes).

Relationship Building Blocks

Foreign keys establish connections between tables:

  • A foreign key in the child references the primary key of the parent

  • The parent side always has cardinality of exactly one

Uniqueness constraints determine cardinality on the child side:

  • No constraint on foreign key: many children per parent (one-to-many)

  • Unique constraint on foreign key: one child per parent (one-to-one)

  • Foreign key is entire primary key: at most one child per parent (one-to-one extension)

In DataJoint, foreign keys also participate in the relational workflow model: each dependency not only enforces referential integrity but also prescribes the order of operations. When table B references table A, A must be populated before B. The resulting schema is a directed acyclic graph (DAG) whose arrows describe both data relationships and workflow execution order.

Setup

First, we import DataJoint and create a schema for our examples:

import datajoint as dj

schema = dj.Schema('relationships_tutorial')

One-to-Many Relationships

The one-to-many relationship is the most common pattern: one parent entity can have multiple associated child entities, but each child belongs to exactly one parent.

Pattern 1: Foreign Key as Secondary Attribute

When the foreign key is a secondary attribute (below the --- line), the child table has its own independent primary key.

DataJoint
SQL
@schema
class Customer(dj.Manual):
    definition = """
    customer_id : int unsigned
    ---
    full_name : varchar(60)
    """

@schema
class Account(dj.Manual):
    definition = """
    account_id : int unsigned     # account's own identity
    ---
    -> Customer                   # foreign key as secondary attribute
    open_date : date
    balance : decimal(10,2)
    """

Characteristics:

  • Each account has its own independent identity (account_id)

  • Each account belongs to exactly one customer

  • Each customer can have multiple accounts (or none)

  • In diagrams: shown as a dashed line

# Pattern 1: Foreign key as secondary attribute (dashed line)
@schema
class Customer(dj.Manual):
    definition = """
    customer_id : int unsigned
    ---
    full_name : varchar(60)
    """

@schema
class AccountIndependent(dj.Manual):
    definition = """
    account_id : int unsigned     # account's own identity
    ---
    -> Customer                   # foreign key as secondary attribute
    open_date : date
    balance : decimal(10,2)
    """
# View the diagram - notice the dashed line (secondary FK)
dj.Diagram(Customer) + dj.Diagram(AccountIndependent)

Pattern 2: Foreign Key in Composite Primary Key

When the foreign key is part of the primary key (above ---), the child’s identity includes the parent’s identity.

DataJoint
SQL
@schema
class Customer(dj.Manual):
    definition = """
    customer_id : int unsigned
    ---
    full_name : varchar(60)
    """

@schema
class Account(dj.Manual):
    definition = """
    -> Customer                   # foreign key in primary key
    account_num : int unsigned    # account number within customer
    ---
    open_date : date
    balance : decimal(10,2)
    """

Characteristics:

  • Accounts are identified within the context of their customer

  • Account #3 for Customer A is different from Account #3 for Customer B

  • The primary key (customer_id, account_num) cascades through foreign keys

  • In diagrams: shown as a thin solid line

Key difference: With a dashed line, account_id must be globally unique. With a thin solid line, account_num only needs to be unique within each customer.

# Pattern 2: Foreign key in composite primary key (thin solid line)
@schema
class AccountContained(dj.Manual):
    definition = """
    -> Customer                   # foreign key in primary key
    account_num : int unsigned    # account number within customer
    ---
    open_date : date
    balance : decimal(10,2)
    """
# View the diagram - notice the thin solid line (FK in primary key)
dj.Diagram(Customer) + dj.Diagram(AccountContained)

Optional One-to-Many with Nullable Foreign Key

To allow children without a parent reference, use the nullable modifier:

DataJoint
SQL
@schema
class Account(dj.Manual):
    definition = """
    account_id : int unsigned
    ---
    -> [nullable] Customer        # optional owner
    open_date : date
    balance : decimal(10,2)
    """

# Accounts can exist without an assigned customer
Account.insert1({
    'account_id': 9999,
    'customer_id': None,          # no owner yet
    'open_date': '2024-01-01',
    'balance': 0.00
})

One-to-One Relationships

A one-to-one relationship ensures that each parent has at most one associated child. There are several ways to achieve this.

Pattern 1: Foreign Key as Entire Primary Key (Extension)

The strongest one-to-one relationship occurs when the foreign key is the entire primary key of the child table.

DataJoint
SQL
@schema
class Customer(dj.Manual):
    definition = """
    customer_id : int unsigned
    ---
    full_name : varchar(60)
    """

@schema
class CustomerPreferences(dj.Manual):
    definition = """
    -> Customer                   # foreign key IS the primary key
    ---
    theme : varchar(20)
    notification_email : varchar(100)
    """

Characteristics:

  • CustomerPreferences shares the same identity as Customer

  • At most one preferences record per customer

  • In diagrams: shown as a thick solid line

  • Table name is not underlined (no new dimension introduced)

Why use separate tables for one-to-one?

  • Modularity: Separate optional data from required data

  • Access control: Different permissions for different data

  • Avoiding NULL columns: Instead of nullable columns, use a separate table

  • Workflow stages: Each table represents a processing step

# Pattern 1: Foreign key as entire primary key (thick solid line)
@schema
class CustomerPreferences(dj.Manual):
    definition = """
    -> Customer                   # foreign key IS the primary key
    ---
    theme : varchar(20)
    notification_email : varchar(100)
    """
# View the diagram - notice the thick solid line (extension/one-to-one)
dj.Diagram(Customer) + dj.Diagram(CustomerPreferences)

Pattern 2: Unique Foreign Key (Reference)

Adding the unique modifier to a secondary foreign key creates a one-to-one relationship while maintaining independent identity:

DataJoint
SQL
@schema
class Employee(dj.Manual):
    definition = """
    employee_id : int unsigned
    ---
    full_name : varchar(60)
    """

@schema
class ParkingSpot(dj.Manual):
    definition = """
    spot_id : int unsigned        # spot has its own identity
    ---
    -> [unique] Employee          # at most one spot per employee
    location : varchar(30)
    """

Characteristics:

  • Each parking spot has its own identity (spot_id)

  • The unique constraint ensures one spot per employee maximum

  • In diagrams: shown as a dashed line (uniqueness not visible)

# Pattern 2: Unique foreign key (dashed line, uniqueness not visible)
@schema
class Employee(dj.Manual):
    definition = """
    employee_id : int unsigned
    ---
    full_name : varchar(60)
    """

@schema
class ParkingSpot(dj.Manual):
    definition = """
    spot_id : int unsigned        # spot has its own identity
    ---
    -> [unique] Employee          # at most one spot per employee
    location : varchar(30)
    """
# View the diagram - notice the dashed line (unique constraint NOT visible!)
dj.Diagram(Employee) + dj.Diagram(ParkingSpot)

Pattern 3: Optional One-to-One

Combine nullable and unique for an optional one-to-one relationship:

DataJoint
SQL
@schema
class ParkingSpot(dj.Manual):
    definition = """
    spot_id : int unsigned
    ---
    -> [nullable, unique] Employee   # optional, but exclusive
    location : varchar(30)
    """

Multiple spots can be unassigned (NULL), but each employee can be assigned at most one spot.

Many-to-Many Relationships

A many-to-many relationship allows entities from both sides to be connected to multiple entities on the other side. This requires an association table (also called junction table or bridge table).

Basic Association Table

DataJoint
SQL
@schema
class Student(dj.Manual):
    definition = """
    student_id : int unsigned
    ---
    student_name : varchar(60)
    """

@schema
class Course(dj.Manual):
    definition = """
    course_code : char(8)
    ---
    course_title : varchar(100)
    """

@schema
class Enrollment(dj.Manual):
    definition = """
    -> Student                    # part of composite primary key
    -> Course                     # part of composite primary key
    ---
    enrollment_date : date
    grade : enum('A', 'B', 'C', 'D', 'F', 'IP')
    """

Characteristics:

  • Enrollment has a composite primary key from both parents

  • Each student can enroll in many courses

  • Each course can have many students

  • Each student-course combination appears at most once

  • Association table can have its own attributes (grade, enrollment_date)

# Many-to-many with association table
@schema
class Student(dj.Manual):
    definition = """
    student_id : int unsigned
    ---
    student_name : varchar(60)
    """

@schema
class Course(dj.Manual):
    definition = """
    course_code : char(8)
    ---
    course_title : varchar(100)
    """

@schema
class Enrollment(dj.Manual):
    definition = """
    -> Student                    # part of composite primary key
    -> Course                     # part of composite primary key
    ---
    enrollment_date : date
    grade : enum('A', 'B', 'C', 'D', 'F', 'IP')
    """
# View the many-to-many diagram
dj.Diagram(Student) + dj.Diagram(Course) + dj.Diagram(Enrollment)

Constrained Many-to-Many

By moving one foreign key below --- with a unique constraint, you can create constrained relationships:

DataJoint
SQL
# Each customer has at most one account, but accounts can be shared
@schema
class CustomerAccount(dj.Manual):
    definition = """
    -> Customer                   # primary key
    ---
    -> Account                    # each customer links to one account
    """

# With unique constraint: each account belongs to at most one customer
@schema
class AccountOwnership(dj.Manual):
    definition = """
    -> Account                    # primary key
    ---
    -> [unique] Customer          # each customer owns at most one account
    """

Hierarchies

Hierarchies are cascading one-to-many relationships that create tree structures. Each level adds a new dimension to the composite primary key.

DataJoint
SQL
@schema
class Study(dj.Manual):
    definition = """
    study : varchar(8)            # study code
    ---
    investigator : varchar(60)
    study_description : varchar(255)
    """

@schema
class Subject(dj.Manual):
    definition = """
    -> Study
    subject_id : varchar(12)      # subject within study
    ---
    species : enum('human', 'primate', 'rodent')
    date_of_birth = null : date
    """

@schema
class Session(dj.Manual):
    definition = """
    -> Subject
    session : smallint unsigned   # session within subject
    ---
    session_date : date
    operator : varchar(60)
    """

@schema
class Scan(dj.Manual):
    definition = """
    -> Session
    scan : smallint unsigned      # scan within session
    ---
    scan_time : time
    scan_type : varchar(30)
    """

Key features of hierarchies:

  • Primary keys cascade through the hierarchy

  • Scan’s primary key is (study, subject_id, session, scan)

  • Direct joins work across any levels: Study * Scan is valid

  • In diagrams: chain of thin solid lines

This pattern is common in scientific data organization (BIDS, NWB) where data is structured as Study → Subject → Session → Data.

# Hierarchy pattern: cascading one-to-many relationships
@schema
class Study(dj.Manual):
    definition = """
    study : varchar(8)            # study code
    ---
    investigator : varchar(60)
    study_description : varchar(255)
    """

@schema
class Subject(dj.Manual):
    definition = """
    -> Study
    subject_id : varchar(12)      # subject within study
    ---
    species : enum('human', 'primate', 'rodent')
    date_of_birth = null : date
    """

@schema
class Session(dj.Manual):
    definition = """
    -> Subject
    session : smallint unsigned   # session within subject
    ---
    session_date : date
    operator : varchar(60)
    """

@schema
class Scan(dj.Manual):
    definition = """
    -> Session
    scan : smallint unsigned      # scan within session
    ---
    scan_time : time
    scan_type : varchar(30)
    """
# View the hierarchy diagram - chain of thin solid lines
dj.Diagram(Study) + dj.Diagram(Subject) + dj.Diagram(Session) + dj.Diagram(Scan)

Sequences

Sequences are cascading one-to-one relationships representing workflow steps. Each step extends the identity of the previous step.

DataJoint
SQL
@schema
class Order(dj.Manual):
    definition = """
    order_id : int unsigned
    ---
    order_date : date
    customer : varchar(60)
    """

@schema
class Shipment(dj.Manual):
    definition = """
    -> Order                      # same identity as Order
    ---
    ship_date : date
    carrier : varchar(30)
    """

@schema
class Delivery(dj.Manual):
    definition = """
    -> Shipment                   # same identity as Shipment (and Order)
    ---
    delivery_date : date
    signature : varchar(60)
    """

Key features of sequences:

  • All tables share the same primary key (order_id)

  • Each step is optional—not every order is shipped, not every shipment is delivered

  • Direct queries across steps: Order * Delivery works without including Shipment

  • In diagrams: chain of thick solid lines

# Sequence pattern: cascading one-to-one relationships
@schema
class Order(dj.Manual):
    definition = """
    order_id : int unsigned
    ---
    order_date : date
    customer : varchar(60)
    """

@schema
class Shipment(dj.Manual):
    definition = """
    -> Order                      # same identity as Order
    ---
    ship_date : date
    carrier : varchar(30)
    """

@schema
class Delivery(dj.Manual):
    definition = """
    -> Shipment                   # same identity as Shipment (and Order)
    ---
    delivery_date : date
    signature : varchar(60)
    """
# View the sequence diagram - chain of thick solid lines
dj.Diagram(Order) + dj.Diagram(Shipment) + dj.Diagram(Delivery)

Parameterization

The parameterization pattern applies different methods, algorithms, or parameters to the same entities. The association table itself becomes the entity of interest.

DataJoint
SQL
@schema
class Image(dj.Manual):
    definition = """
    image_id : int unsigned
    ---
    raw_image : longblob
    """

@schema
class EnhanceMethod(dj.Lookup):
    definition = """
    method_id : int unsigned
    ---
    method_name : varchar(30)
    method_description : varchar(255)
    """
    contents = [
        (1, 'sharpen', 'Sharpen edges using unsharp mask'),
        (2, 'denoise', 'Remove noise using median filter'),
        (3, 'contrast', 'Enhance contrast using histogram equalization'),
    ]

@schema
class EnhancedImage(dj.Computed):
    definition = """
    -> Image
    -> EnhanceMethod
    ---
    enhanced_image : longblob
    processing_time : float
    """

Characteristics:

  • Same image processed with multiple methods

  • Same method applied to multiple images

  • Results stored with composite key (image_id, method_id)

  • Typical in computational workflows with parameter sweeps

# Parameterization pattern
@schema
class Image(dj.Manual):
    definition = """
    image_id : int unsigned
    ---
    raw_image : longblob
    """

@schema
class EnhanceMethod(dj.Lookup):
    definition = """
    method_id : int unsigned
    ---
    method_name : varchar(30)
    method_description : varchar(255)
    """
    contents = [
        (1, 'sharpen', 'Sharpen edges using unsharp mask'),
        (2, 'denoise', 'Remove noise using median filter'),
        (3, 'contrast', 'Enhance contrast using histogram equalization'),
    ]

@schema
class EnhancedImage(dj.Computed):
    definition = """
    -> Image
    -> EnhanceMethod
    ---
    enhanced_image : longblob
    processing_time : float
    """
# View the parameterization diagram
dj.Diagram(Image) + dj.Diagram(EnhanceMethod) + dj.Diagram(EnhancedImage)

Directed Graphs

Directed graphs model relationships where entities of the same type connect to each other. Use renamed foreign keys to reference the same parent table multiple times.

DataJoint
SQL
@schema
class Neuron(dj.Manual):
    definition = """
    neuron_id : int unsigned
    ---
    neuron_type : enum('excitatory', 'inhibitory')
    layer : tinyint unsigned
    """

@schema
class Synapse(dj.Manual):
    definition = """
    synapse_id : int unsigned
    ---
    -> Neuron.proj(presynaptic='neuron_id')
    -> Neuron.proj(postsynaptic='neuron_id')
    strength : float
    synapse_type : varchar(30)
    """

The .proj() operator renames the foreign key attribute:

  • presynaptic references Neuron.neuron_id

  • postsynaptic references Neuron.neuron_id

In diagrams, orange dots indicate renamed foreign keys.

Other examples:

  • Employees and managers (both are employees)

  • Cities connected by flights

  • Users following other users

# Directed graph pattern with renamed foreign keys
@schema
class Neuron(dj.Manual):
    definition = """
    neuron_id : int unsigned
    ---
    neuron_type : enum('excitatory', 'inhibitory')
    layer : tinyint unsigned
    """

@schema
class Synapse(dj.Manual):
    definition = """
    synapse_id : int unsigned
    ---
    -> Neuron.proj(presynaptic='neuron_id')
    -> Neuron.proj(postsynaptic='neuron_id')
    strength : float
    synapse_type : varchar(30)
    """
# View the diagram - notice the orange dots indicating renamed foreign keys
dj.Diagram(Neuron) + dj.Diagram(Synapse)

Design Puzzle: Unique Designation

A common challenge: how to designate exactly one special item among many? For example: each state has many cities, but exactly one capital.

Requirements:

  1. Each city belongs to exactly one state

  2. Each state has exactly one capital

  3. A capital must be a city in that state

DataJoint
SQL
@schema
class State(dj.Manual):
    definition = """
    state : char(2)               # two-letter state code
    ---
    state_name : varchar(30)
    """

@schema
class City(dj.Manual):
    definition = """
    -> State
    city_name : varchar(60)
    ---
    population : int unsigned
    capital = null : enum('YES')  # nullable enum for designation
    unique index(state, capital)  # only one 'YES' per state
    """

How it works:

  • capital is NULL for non-capitals (most cities)

  • capital = 'YES' for the capital city

  • unique index(state, capital) ensures only one ‘YES’ per state

  • NULL values don’t violate uniqueness (multiple NULLs allowed)

This pattern works for team captains, default addresses, primary contacts, etc.

# Unique designation pattern
@schema
class State(dj.Manual):
    definition = """
    state : char(2)               # two-letter state code
    ---
    state_name : varchar(30)
    """

@schema
class City(dj.Manual):
    definition = """
    -> State
    city_name : varchar(60)
    ---
    population : int unsigned
    capital = null : enum('YES')  # nullable enum for designation
    unique index(state, capital)  # only one 'YES' per state
    """
# View the diagram
dj.Diagram(State) + dj.Diagram(City)

Relationship Summary

PatternForeign Key PositionConstraintCardinalityDiagram Line
One-to-many (reference)SecondaryNone1:NDashed
One-to-many (containment)Part of PKNone1:NThin solid
One-to-one (extension)Entire PKInherent1:1Thick solid
One-to-one (reference)Secondaryunique1:1Dashed
Optional relationshipSecondarynullable1:0..NDashed
Optional one-to-oneSecondarynullable, unique1:0..1Dashed
Many-to-manyBoth in PKNoneM:NTwo thin solids

Complete Schema Diagram

View all the tables we’ve created in this tutorial:

# View the entire schema
dj.Diagram(schema)

Cleanup

Optionally drop the tutorial schema when done:

# Uncomment to drop the schema
# schema.drop()