Skip to article frontmatterSkip to article content

Create Tables

Tables: The Foundation of Data Integrity

In the previous chapter, we learned that relational databases excel at enforcing data integrity constraints. Tables are where these constraints come to life through:

  1. Domain Integrity — Data types restrict values to valid ranges

  2. Completeness — Required vs. optional attributes ensure necessary data is present

  3. Entity Integrity — Primary keys uniquely identify each record (covered in Primary Keys)

  4. Referential Integrity — Foreign keys enforce relationships (covered in Foreign Keys)

This chapter shows how to declare tables in DataJoint with proper data types and attribute specifications that enforce these constraints automatically.

Prerequisites: Schema Declaration

Before creating tables, we must declare a schema object that provides a namespace for our tables (see Create Schemas):

import datajoint as dj

# Define the schema namespace
schema = dj.Schema('tutorial')

Table Declaration Syntax

To declare a table in DataJoint, define a Python class with:

  1. Schema decorator@schema associates the table with the schema

  2. Base class — Inherit from dj.Manual, dj.Lookup, dj.Imported, or dj.Computed

  3. Definition string — Multi-line string specifying attributes and types

Here’s a basic example:

@schema
class Experiment(dj.Manual):
    definition = """
    experiment_id: int      # Unique identifier for each experiment
    ---
    experiment_name: varchar(64)   # Name of the experiment
    experiment_date: date          # Date when the experiment was conducted
    """

Anatomy of a Table Definition

Table Name: The Python class name (Experiment) becomes the table name in the database.

  • Convention: Singular, CamelCase (e.g., RecordingSession, not recording_sessions)

  • No underscores in multi-word names

Attribute Specification: Each line defines one attribute (column):

attribute_name : datatype  # optional comment

Primary Key Separator: The line --- divides:

  • Above ---: Primary key attributes (jointly uniquely identify each entity)

  • Below ---: Dependent attributes (describe the entity)

Comments: Text after # documents the attribute’s meaning

Data Types: Enforcing Domain Integrity

DataJoint uses MySQL data types to enforce domain integrity — ensuring values are within valid ranges. Here are the most commonly used types:

Numeric Types

Integer Types (for whole numbers):

  • tinyint — Range: -128 to 127 (or 0 to 255 if unsigned)

  • smallint — Range: -32,768 to 32,767 (or 0 to 65,535 if unsigned)

  • int — Range: -2.1B to 2.1B (or 0 to 4.3B if unsigned)

  • bigint — Range: -9.2E18 to 9.2E18 (or 0 to 1.8E19 if unsigned)

Floating-Point Types (for approximate decimal numbers):

  • float — 32-bit floating point (~7 decimal digits precision)

  • double — 64-bit floating point (~15 decimal digits precision)

Fixed-Point Type (for exact decimal numbers):

  • decimal(M, D) — M total digits, D after decimal point

    • Example: decimal(10, 2) for monetary values like 12345678.90

    • Can be unsigned for non-negative values only

Temporal Types

  • date — Date only: YYYY-MM-DD (e.g., 2024-10-31)

  • time — Time only: HH:MM:SS (e.g., 14:30:00)

  • datetime — Date and time: YYYY-MM-DD HH:MM:SS

  • timestamp — Unix timestamp (seconds since 1970-01-01, auto-updates)

String Types

  • char(M) — Fixed-length string, padded to M characters (max 255)

    • Use for: Fixed-length codes (e.g., char(2) for US state codes)

  • varchar(M) — Variable-length string, up to M characters (max 65,535)

    • Use for: Names, descriptions, free text

  • enum('val1', 'val2', ...) — Restricts values to specified options

    • Use for: Categorical data with known set (e.g., experimental conditions)

Example with Multiple Data Types

@schema
class Mouse(dj.Manual):
    definition = """
    mouse_id: int unsigned          # Unique mouse identifier (1-4B range)
    ---
    mouse_name: varchar(50)         # Mouse name or nickname
    date_of_birth: date             # Birth date
    sex: enum('M', 'F', 'U')        # Sex: Male, Female, or Unknown
    weight: decimal(5,2) unsigned   # Weight in grams (0.00-999.99)
    """

This definition enforces:

  • mouse_id must be a non-negative integer

  • mouse_name cannot exceed 50 characters

  • date_of_birth must be a valid date

  • sex can only be ‘M’, ‘F’, or ‘U’

  • weight must be a non-negative decimal with 2 decimal places

The database automatically rejects any insert that violates these constraints.

Inserting Data

Once a table is declared, insert data using:

insert1(record) — Insert a single record (dictionary)

Experiment.insert1({
    'experiment_id': 1,
    'experiment_name': 'Spatial Navigation Task',
    'experiment_date': '2024-10-31'
})

insert(records) — Insert multiple records (list of dictionaries)

Experiment.insert([
    {'experiment_id': 2, 'experiment_name': 'Fear Conditioning', 'experiment_date': '2024-11-01'},
    {'experiment_id': 3, 'experiment_name': 'Object Recognition', 'experiment_date': '2024-11-02'}
])

Viewing Table Contents

View Table Structure

Reference the table class to see its definition:

Experiment

View Table Data

Call the table class (with parentheses) to see its contents:

Experiment()  # Shows all records in table

Output displays:

  • Asterisk (*) marks primary key attributes

  • All attribute values for each record

  • Total record count

Deleting Records

Delete Specific Records

Use the restriction operator & with primary key values:

# Delete experiment with ID 1
(Experiment & {'experiment_id': 1}).delete()

You’ll be prompted to confirm before deletion is committed.

Delete All Records

Call .delete() on the table class:

Experiment.delete()  # Prompts for confirmation

Dropping Tables

Dropping a table permanently removes it and all its data from the database:

Experiment.drop()  # Prompts for confirmation

For information on dropping entire schemas, see Create Schemas.

Table Base Classes

DataJoint provides four base classes for different data management patterns:

Base ClassPurposeWhen to Use
dj.ManualManually entered dataSubject info, experimental protocols
dj.LookupReference data, rarely changesEquipment lists, parameter sets
dj.ImportedData imported from external filesRaw recordings, behavioral videos
dj.ComputedDerived from other tablesSpike sorting results, analyses

We’ll explore Imported and Computed tables in the Computation section.

Summary

Tables are the fundamental building blocks where data integrity is enforced:

  1. Table declarations specify structure using the @schema decorator and definition string

  2. Data types enforce domain integrity by restricting values to valid ranges

  3. Primary keys (above ---) enforce entity integrity through unique identification

  4. Required attributes enforce completeness by ensuring necessary data is present

  5. DataJoint operations (insert, delete, drop) respect these integrity constraints