Classic Sales - The DataJoint Book

This is the classic sales database provided as part of the MySQL tutorial. http://www.mysqltutorial.org/mysql-sample-database.aspx

This example is adapated here without change under fair use for the purposes of instruction.

Background¶

The Classic Models dataset is a well-known sample schema that ships with many SQL tutorials. It captures the operations of a miniature wholesaler: offices, employees, customers, orders, order line items, product lines, and payments. Because the design is already normalized into clean entity sets, it is a convenient playground for illustrating DataJoint’s relational workflow concepts.

In this notebook we:

Load the canonical schema exactly as published by the MySQL team so you can compare the DataJoint rendition with the original SQL definitions.
Highlight the workflow perspective: foreign keys organize the tables into a directed acyclic graph (customers → orders → payments, product lines → products → order details, etc.). This makes it easy to trace the sequence of business operations.
Demonstrate interoperability: we ingest the SQL dump with %sql, then use schema.spawn_missing_classes() to materialize DataJoint table classes directly from the existing relational structure.

Keep the Relational Workflows and Relationships chapters in mind as you work through this example; you will see the same principles—normalized entity tables, association tables, and workflow-directed foreign keys—applied to a realistic business domain.

Loading the dataset¶

The next few cells perform the one-time setup:

Connect to the MySQL server using the %sql magic.
Execute the original classicsales.sql script to create and populate the tables exactly as described in the tutorial.
Call schema.spawn_missing_classes() so DataJoint introspects the existing schema and generates Python classes that follow the same relational structure.

Because we run against the canonical SQL definitions, you can compare the DataJoint diagrams directly with the diagrams that appear in business-database textbooks. It also illustrates how DataJoint fits into an existing workflow—you can adopt legacy schemas while gaining the benefits of workflow execution, cascading deletes, and DAG-based querying.

%load_ext sql
%sql mysql+pymysql://dev:devpass@db

The sql extension is already loaded. To reload it, use:
  %reload_ext sql

%sql --file classicsales.sql

import datajoint as dj
schema = dj.Schema('classic_sales')
schema.spawn_missing_classes()

dj.Diagram(schema)

Reading the schema diagram¶

The diagram that appears above is the DataJoint view of the Classic Models schema. Notice how the arrows flow from offices and employees down to customers, orders, and order details—exactly the workflow described in the earlier chapters:

Offices → EmployeePosition → Customer mirrors the sales workflow: employees are assigned to offices, then to customer accounts.
Customer → Order → Order.Details captures the order lifecycle, with each dependency ensuring that parent records exist before child records are inserted.
ProductLine → Product → Order.Details shows the product catalog driving each line item. Because the primary keys cascade, you can query across the entire chain with joins such as ProductLine * Order.Details without explicitly listing intermediate tables.

This DAG structure is what makes it easy to apply DataJoint’s relational workflow operators to a traditional business database.

Payment()

Next steps¶

From here you can explore the database using either SQL (%sql) or DataJoint operators. Because the schema follows the relational workflow model, joins such as Order * Payment or Product * Order.Details naturally trace the same business processes that the foreign keys encode. Try recreating some of the analytical queries from the MySQL tutorial using DataJoint’s restriction (&), projection (.proj()), and aggregation (.aggr()) operators to see how workflow-aware queries feel in a classic OLTP dataset.