Design Alterations - The DataJoint Book

After defining a table, modifications to its table definition in the DataJoint table class will have no effect. The definition string is only used at table definition time and is ignored if the corresponding table already exists.

Changes in schema design are often accomplished by schema migration, i.e. defining new tables and migrating the data into them and then dropping the old tables.

import datajoint as dj
schema = dj.schema('university')
schema.spawn_missing_classes()

Connecting dimitri@localhost:3306

dj.Diagram(schema)

Let’s define table Exam to contain the results of midterm and final exams.

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    score  : decimal(5,2)   # percent of total 
    """

Exam()

dj.Diagram(schema)

Let’s change the definition to record the exam date too

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll
    ---
    exam_date : date
    score  : decimal(5,2)   # percent of total 
    """

Even though we updated the definition in the class, the change is not reflected on the server:

Exam()

You can examine the definition on the server using the describe method:

Exam.describe();

Exam.heading

One solution is to simply drop the table and declare it again, which is fine when it contains no valuable data. But let’s consider the case when the table is already populated and we wish to keep the existing data.

First, let’s insert some exam entries:

# pick 100 random enrollments from the current term
import random
keys = random.sample(((Enroll - Exam) & CurrentTerm).fetch('KEY'), 100)
# assign random scores
for key in keys:
    Exam.insert1(dict(key, score=random.randint(0,10000)/100))

Now we can alter the Exam table with the new definition:

Exam()

We can now use the alter method to apply the new definition:

dj.__version__

Exam.alter()

Oh! New attributes cannot be added to tables with existing data without providing a default value. Let’s update the definition to allow exam_date to be empty (default to null) and alter the table again.

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_date = null: date  
    score  : decimal(5,2)   # percent of total 
    """

Exam.alter()

Exam()

Now let’s add some grades for today’s exam:

from datetime import datetime
today = datetime.now().date().isoformat()

today

# pick 20 random enrollments from the current term
keys = random.sample(((Enroll - Exam) & CurrentTerm).fetch('KEY'), 20)
# assign random scores
for key in keys:
    Exam.insert1(dict(key, score=random.randint(0,10000)/100, exam_date=today))

Exam()

Now let’s say we want to rename score into exam_score:

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_date = null: date  
    exam_score  : decimal(5,2)   # percent of total 
    """

# Say NO!
Exam.alter()

Note that rather than renaming, alter attempted to drop the old attribute and add the new one.

To rename, we must indicate the old attribute name in curly brackets as the first thing in the attribute comment:

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_score  : decimal(5,2)   # percent of total 
    exam_date = null: date  
    """

Exam.alter()

Exam()

The old attribute name is preserved in the comment just in case, but you can remove it with another alter command:

Exam.describe();

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_date = null: date  
    exam_score  : decimal(6,3)   # percent of total 
    photocopy: longb  
    """

Exam.alter()

Exam()

Exam.describe();

We can change the data type of an existing attribute as long as the existing data can fit. However, some precision may be lost.

For example, let’s attempt to change the exam_score attribute into a tinyint unsigned. This will result in the rounding of the scores:

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_date = null: date  
    exam_score  : tinyint unsigned  # percent of total 
    """

Exam.alter()

Exam()

But if we attmept to change it to decimal(3,2), for example, we may get an error because some values are too large to fit in this data type.

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_date = null: date  
    exam_score  : decimal(3,2)   # percent of total 
    """

Exam.alter()

@schema
class Exam(dj.Manual):
    definition = """
    -> Enroll 
    ---
    exam_date = null: date  
    exam_score  : decimal(5,2)   # percent of total 
    """

Exam.alter()

# restored to higher precision but fractional part has been lost:
Exam()

Finally, let’s add a table comment and reorder the attributes:

@schema
class Exam(dj.Manual):
    definition = """
    # Exam taken by a student enrolled in a course section and the grade
    -> Enroll 
    ---
    exam_score  : decimal(5,2)   # percent of total 
    """

Exam.alter()

Exam()

#clean up
Exam.drop()

Schema Design

Adapted Attribute Types

Example Designs

Schema Examples