Delete

The delete operation removes entities from the database along with all their downstream dependents. This cascading behavior is fundamental to maintaining computational validity—the guarantee that derived data remains consistent with its inputs.

Cascading Delete and Computational Validity

In the Relational Workflow Model, every entity in a Computed or Imported table was derived from specific upstream data. If that upstream data is deleted or found to be incorrect, the derived results become meaningless—they are artifacts of inputs that no longer exist or were never valid.

DataJoint enforces this principle through cascading deletes:

Subject ← Session ← Recording ← SpikeSort ← UnitAnalysis
             │          │           │            │
             └──────────┴───────────┴────────────┘
            Deleting a Session removes all of these

When you delete an entity:

  1. All entities that reference it (via foreign keys) are identified

  2. Those entities are recursively deleted

  3. The cascade continues through the entire dependency graph

  4. The final state is always referentially consistent

This is not merely cleanup—it is enforcing the semantics of the workflow. Computed results only have meaning in relation to their inputs.
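
For concreteness, here is a minimal sketch of how such a dependency chain is declared in DataJoint. The table names match the diagram above, but the schema name and attributes are hypothetical; the chain continues the same way through SpikeSort and UnitAnalysis:

import datajoint as dj

# Hypothetical schema name for illustration
schema = dj.Schema('my_pipeline')

@schema
class Subject(dj.Manual):
    definition = """
    subject_id : varchar(16)
    ---
    species : varchar(32)
    """

@schema
class Session(dj.Manual):
    definition = """
    -> Subject
    session_date : date
    """

@schema
class Recording(dj.Imported):
    definition = """
    -> Session
    recording_id : int
    """

Each -> Parent line declares a foreign key, and these foreign keys are exactly the edges a cascading delete traverses: deleting from Session removes the matching Recording entries, and so on downstream.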

The delete Method

<Table>.delete(safemode=True, quick=False)

Parameters:

  • safemode (default: True): Prompts for confirmation before deleting

  • quick (default: False): If True, skips dependency analysis (use with caution)

Examples:

# Delete with confirmation prompt
(Session & {'subject_id': 'M001', 'session_date': '2024-01-15'}).delete()

# Delete without confirmation (scripted use)
(Session & {'subject_id': 'M001', 'session_date': '2024-01-15'}).delete(safemode=False)

# Delete all entries in a table (with confirmation)
Session.delete()

Use Cases for Delete

1. Correcting Upstream Errors

The most common use of delete is correcting errors in upstream data. Rather than updating values (which would leave downstream computations inconsistent), you:

  1. Delete the incorrect upstream data (cascade removes all derived results)

  2. Insert the corrected data

  3. Repopulate to regenerate downstream computations

# Discovered an error in session metadata
(Session & bad_session_key).delete(safemode=False)

# Insert corrected data
Session.insert1(corrected_session_data)

# Regenerate all downstream analysis
Recording.populate()
SpikeSort.populate()
UnitAnalysis.populate()

2. Reprocessing with Updated Code

When you update your analysis code, you may want to regenerate computed results:

# Delete computed results to force recomputation
(SpikeSort & restriction).delete(safemode=False)

# Repopulate with updated make() method
SpikeSort.populate()

3. Removing Obsolete Data

When data is no longer needed:

# Remove old pilot data
(Subject & 'subject_id LIKE "pilot%"').delete()

4. Selective Deletion with Restrictions

Use DataJoint’s restriction syntax to target specific subsets:

# Delete only failed recordings
(Recording & 'quality < 0.5').delete()

# Delete sessions from a specific date range
(Session & 'session_date < "2023-01-01"').delete()

# Delete based on joined conditions
(SpikeSort & (Recording & 'brain_region = "V1"')).delete()

The Delete-Reinsert-Repopulate Pattern

This pattern is the standard way to handle corrections in DataJoint:

def correct_session(session_key, corrected_data):
    """Correct session data and regenerate all downstream analysis."""
    
    # 1. Delete the session (cascades to all downstream)
    (Session & session_key).delete(safemode=False)
    
    # 2. Insert corrected data
    Session.insert1(corrected_data)
    
    # 3. Repopulate downstream tables
    # DataJoint's populate() automatically determines what needs to run
    Recording.populate()
    ProcessedRecording.populate()
    Analysis.populate()
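
A hypothetical invocation (the key and corrected row below are placeholders for your own values; 'experimenter' is a made-up attribute):

# Hypothetical values for illustration
bad_key = {'subject_id': 'M001', 'session_date': '2024-01-15'}
fixed_row = {'subject_id': 'M001', 'session_date': '2024-01-15',
             'experimenter': 'alice'}  # 'experimenter' is hypothetical

correct_session(bad_key, fixed_row)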

This pattern ensures:

  • No orphaned or inconsistent computed results

  • A clean break: the incorrect data is removed outright, not merely flagged or hidden

  • All downstream results reflect the corrected inputs

Preview Before Deleting

Always verify what will be deleted before executing:

# First, check what matches your restriction
Session & {'subject_id': 'M001'}

# List the downstream tables whose entries will also be deleted
(Session & {'subject_id': 'M001'}).descendants()

# Then delete when confident
(Session & {'subject_id': 'M001'}).delete()
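
Because DataJoint query expressions support Python's len(), you can also count how many entries a restriction matches before committing to the delete. Here Recording is restricted by the same key, which works when subject_id propagates into its primary key, as in the schema sketch earlier:

# Count matches before deleting
n_sessions = len(Session & {'subject_id': 'M001'})
n_recordings = len(Recording & {'subject_id': 'M001'})
print(f"{n_sessions} sessions and {n_recordings} recordings will be removed")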

Safety Mechanisms

DataJoint provides several safeguards:

  1. safemode=True (default): Requires interactive confirmation showing what will be deleted

  2. Dependency preview: Shows the count of entries in dependent tables that will be affected

  3. Transaction wrapping: The entire cascading delete is atomic—it either fully succeeds or fully rolls back
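
For fully scripted runs, an alternative to passing safemode=False on every call is to turn off safemode for the whole process through DataJoint's configuration object; a minimal sketch:

import datajoint as dj

# Disable interactive confirmation for this process
dj.config['safemode'] = False

# Deletes now proceed without a prompt; the cascade and
# transaction guarantees are unaffected
(Session & bad_session_key).delete()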

Best Practices

  1. Trust the cascade: Don’t manually delete downstream tables first—let DataJoint handle dependencies

  2. Use restrictions: Target specific subsets rather than deleting entire tables

  3. Preview first: Check what matches before deleting, especially with complex restrictions

  4. Keep safemode=True for interactive work: Only use safemode=False in tested scripts

  5. Think in terms of workflow: Deleting is not “cleaning up”—it’s rolling back the workflow to an earlier state

  6. Follow with repopulate: After correcting data, run populate() to bring the pipeline back to a complete state