Skip to article frontmatterSkip to article content

Homework 7: Aggregations

import pymysql
pymysql.install_as_MySQLdb()
%load_ext sql
%config SqlMagic.autocommit=True
%sql mysql://root:simple@127.0.0.1
import datajoint as dj

sales = dj.Schema('classicsales')
sales.spawn_missing_classes()

nations = dj.Schema('nation')
nations.spawn_missing_classes()

hotel = dj.Schema('hotel')
hotel.spawn_missing_classes()

university = dj.Schema('university')
university.spawn_missing_classes()

app = dj.Schema('app')
app.spawn_missing_classes()
[2023-10-31 23:46:29,692][INFO]: Connecting root@fakeservices.datajoint.io:3306
[2023-10-31 23:46:29,738][INFO]: Connected root@fakeservices.datajoint.io:3306
dj.Diagram(sales) + dj.Diagram(hotel) + dj.Diagram(university) + dj.Diagram(app) + dj.Diagram(nations)
Loading...

Homework 7: Aggregations

Problem 1 (sales): Show all the offices along with the number of employees in each.

Problem 2 (sales): Show all employees, including the number of direct reports they have.

Problem 3 (sales): Show the top biggests orders in the current month along with the total amount on the order.

Problem 4 (sales): Show the top 5 customers by the amount of money that they have spent this month, including the amount.

Problem 5 (app): For each addon, show how many people have bought them.

Problem 6 (sales): Show the top 5 employees by the sales they have made so far this year.

Problem 7 (nations): Show the top 5 languages by the number of countries in which they are the official language. Include the number of countries.

Problem 8 (nations): Show the world population and and gdp for 2018.

Problem 9 (nations): Show the world population and GDP for each year.

Problem 10 (nations): Show all continents, along with their populations and GDP in 2018.

Problem 11 (nations): Show all the countries in Africa with a population greater than 100,000,000 in 2018.

Problem 12 (university): Show the total number of students who have taken a math class.

Problem 13 (university): Show the top course by enrollment in the current term.

Problem 14 (hotel): Show the top five guests by the number of nights that they have stayed a the hotel.

Problem 15 (nations): Show all the regions and the average GDP per capita in each for 2018.

Problem 16 (Sales): Show the top five products by total sales (in dollars)

Problem 17 (app): Show the total sales by day over the last month.

Problem 18 (university): Show all the departments and the number of students electing them for their majors.

Problem 19 (university): Show all departments and the number of courses they offer in the current semester.

Problem 20 (university) Show what fraction of students declared “MATH” as their major each year.

Problem 21 (university) Show all courses offered in the current term with at least five students enrolled.

Solutions

Problem 15 - Show the world regions and the GDP per capita in each for 2018

stats2018 = Countries.proj(..., country_name="name") * CountryStats() & 'year=2018'

Regions.aggr(stats2018 , 'name', gdp_per_capita = 'sum(gdp) / sum(population)') 
Loading...
%%sql
-- Show the world's GDP per capita for 2018
use nation;

SELECT sum(gdp) / sum(population) as gdp_per_capital FROM country_stats WHERE year=2018
Loading...
%%sql
use nation; 
-- Show the world's GDP per capita for 2018 by region
SELECT regions.name, sum(gdp) / sum(population) as gdp_per_capita 
FROM regions JOIN countries using (region_id) NATURAL JOIN country_stats 
WHERE year=2018 GROUP BY region_id
ORDER BY gdp_per_capita DESC
Loading...
%%sql
-- show all the regions with GDP per capita over 25,000 in 2018
use nation;

SELECT regions.name, sum(gdp) / sum(population) as gdp_per_capita 
FROM regions JOIN countries using (region_id) NATURAL JOIN country_stats 
WHERE year=2018 
GROUP BY region_id
HAVING gdp_per_capita > 25000
ORDER BY gdp_per_capita DESC
Loading...
%%sql

SELECT * FROM (
    SELECT regions.name, sum(gdp) / sum(population) as gdp_per_capita 
    FROM regions JOIN countries using (region_id) NATURAL JOIN country_stats 
    WHERE year=2018 
    GROUP BY region_id
    ORDER BY gdp_per_capita DESC) as q
WHERE gdp_per_capita > 25000
Loading...