Capstone Project Module: Applying Your SQL Skills
Congratulations on making it through the core modules! You’ve learned how to query databases, modify data, connect related information, aggregate results, design table structures, and understand key real-world considerations like security and performance.
Now it’s time to put it all together. The Capstone Project module offers several scenarios where you can apply your SQL knowledge to solve practical problems, similar to tasks you might encounter in application development or data analysis.
Purpose: To synthesize your learning and gain confidence by building something tangible using SQL.
Instructions:
- Choose one of the projects below that interests you the most.
- Ensure you have your local database environment (SQLite, PostgreSQL, MySQL with a client tool) set up, as established in Module 0.
- If the project requires a specific schema, use the provided
CREATE TABLE
statements (or the schema from the E-commerce database we’ve been using). Load any provided sample data. - Tackle the listed tasks by writing the required SQL queries.
- Test your queries thoroughly in your SQL client to ensure they produce the expected results. Analyze the output.
- Think about why the queries work and how they relate to the application scenario described.
Project Option 1: E-commerce Analytics Dashboard
Context: You are tasked with providing the SQL queries needed to populate key metrics for an internal analytics dashboard for the E-commerce store we’ve used throughout the course.
Objective: Write efficient SQL queries to calculate various performance indicators based on the existing users
, products
, categories
, orders
, and order_items
tables. (Use the schema and sample data provided in Module 0).
Tasks: Write SQL queries to find the following:
- Total Revenue: Calculate the sum of
total_amount
from all orders. - Revenue Last Month: Calculate the total revenue from orders placed in the previous calendar month. (Requires date functions based on
CURRENT_DATE
– syntax might vary, e.g., finding the first and last day of the previous month). - New Users This Month: Count how many users registered in the current calendar month.
- Average Order Value (AOV): Calculate the average
total_amount
across all orders. - Best-Selling Products (by Quantity): List the
product_id
,product_name
, and totalquantity
sold for the top 5 products, ordered by total quantity descending. (Requires joiningorder_items
andproducts
, usingSUM
,GROUP BY
,ORDER BY
,LIMIT
). - Top Spending Customers: List the
user_id
,email
, and the total sum oftotal_amount
for the top 5 customers, ordered by their total spending descending. (Requires joiningusers
andorders
, usingSUM
,GROUP BY
,ORDER BY
,LIMIT
). - Sales Revenue Per Category: List the
category_name
and the total revenue (SUM
oforder_items.quantity * order_items.price_per_unit
) generated from products within each category. (Requires joining multiple tables:categories
,products
,order_items
).
Project Option 2: Social Media Feed Generation
Context: You are building the backend for a simple social media platform. A core feature is generating a user’s feed, showing recent posts from people they follow.
Objective: Define a basic schema (or use the one provided) and write the SQL queries needed to support following users and generating a personalized feed.
Provided Schema: (Run these CREATE TABLE
statements in your database)
SQL
CREATE TABLE users (
user_id INTEGER PRIMARY KEY AUTOINCREMENT,
username VARCHAR(50) NOT NULL UNIQUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE posts (
post_id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL, -- Author of the post
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE
);
-- Junction table for follower relationship
CREATE TABLE followers (
follower_id INTEGER NOT NULL, -- The user who is following
following_id INTEGER NOT NULL, -- The user who is being followed
PRIMARY KEY (follower_id, following_id), -- Ensures a user can't follow the same person twice
FOREIGN KEY (follower_id) REFERENCES users(user_id) ON DELETE CASCADE,
FOREIGN KEY (following_id) REFERENCES users(user_id) ON DELETE CASCADE
);
-- Optional: Add a likes table
CREATE TABLE likes (
like_id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL, -- User who liked the post
post_id INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE (user_id, post_id), -- User can only like a post once
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE,
FOREIGN KEY (post_id) REFERENCES posts(post_id) ON DELETE CASCADE
);
Sample Data Insertion (Run these after creating tables):
SQL
-- Insert Users
INSERT INTO users (username) VALUES ('alice'), ('bob'), ('charlie'), ('diana');
-- Insert Posts (find user_ids first if needed, assume 1=alice, 2=bob, 3=charlie, 4=diana)
INSERT INTO posts (user_id, content) VALUES
(1, 'Hello world! My first post.'),
(2, 'Thinking about SQL today.'),
(1, 'Just joined this cool platform.'),
(3, 'Coding challenges are fun.'),
(4, 'Enjoying the weekend!'),
(2, 'Inner joins vs Outer joins...'),
(1, 'My third post here.');
-- Insert Follow relationships (Alice follows Bob, Charlie; Bob follows Alice; Charlie follows Alice)
INSERT INTO followers (follower_id, following_id) VALUES
(1, 2), -- Alice follows Bob
(1, 3), -- Alice follows Charlie
(2, 1), -- Bob follows Alice
(3, 1); -- Charlie follows Alice
-- Optional: Insert Likes (Alice likes Bob's post 2; Bob likes Alice's post 1)
-- Find post_ids first if needed. Assume post_id 2 is Bob's SQL post, post_id 1 is Alice's first post.
INSERT INTO likes (user_id, post_id) VALUES (1, 2), (2, 1);
Tasks: Write SQL queries to:
- Find Following: Find the
username
of everyone that a specific user is following (e.g., find whouser_id
= 1 is following). RequiresJOIN
. - Find Followers: Find the
username
of everyone who is following a specific user (e.g., find who followsuser_id
= 1). RequiresJOIN
. - Generate Feed: This is the core task. For a specific user (e.g.,
user_id
= 1), retrieve thecontent
andcreated_at
timestamp for all posts made by users that this specific user is following. The feed should be ordered by the post creation date, newest first, and limited to a reasonable number (e.g., 20 posts). (Requires joiningposts
andfollowers
, filtering byfollower_id
,ORDER BY
,LIMIT
). - (Optional) Count Likes: For a specific post (e.g.,
post_id
= 2), count how many likes it has received. (RequiresCOUNT
onlikes
table withWHERE
). - (Optional) Posts Liked by User: Find the
content
of all posts liked by a specific user (e.g.,user_id
= 1). (Requires joininglikes
andposts
).
Project Option 3: Blog/CMS Backend Queries
Context: You are working on a Content Management System (CMS) or blog platform. You need to write the SQL queries that would power common data retrieval operations for displaying posts, authors, and categories.
Objective: Define a basic schema (or use the one provided) for a blog and write SQL queries to fetch data needed for typical blog features.
Provided Schema:
SQL
CREATE TABLE authors (
author_id INTEGER PRIMARY KEY AUTOINCREMENT,
author_name VARCHAR(150) NOT NULL,
bio TEXT
);
CREATE TABLE categories (
category_id INTEGER PRIMARY KEY AUTOINCREMENT,
category_name VARCHAR(100) NOT NULL UNIQUE
);
CREATE TABLE posts (
post_id INTEGER PRIMARY KEY AUTOINCREMENT,
author_id INTEGER NOT NULL,
title VARCHAR(255) NOT NULL,
body TEXT NOT NULL,
published_at TIMESTAMP, -- NULL if draft, set when published
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (author_id) REFERENCES authors(author_id) ON DELETE CASCADE
);
-- Junction table for posts belonging to multiple categories
CREATE TABLE post_categories (
post_id INTEGER NOT NULL,
category_id INTEGER NOT NULL,
PRIMARY KEY (post_id, category_id),
FOREIGN KEY (post_id) REFERENCES posts(post_id) ON DELETE CASCADE,
FOREIGN KEY (category_id) REFERENCES categories(category_id) ON DELETE CASCADE
);
Sample Data Insertion (Run these after creating tables):
SQL
-- Insert Authors
INSERT INTO authors (author_name, bio) VALUES
('Jane Doe', 'Technology blogger and enthusiast.'),
('John Smith', 'Travel writer and photographer.');
-- Insert Categories
INSERT INTO categories (category_name) VALUES ('Technology'), ('Travel'), ('SQL'), ('Web Development');
-- Insert Posts (find author_ids/category_ids if needed, assume 1=Jane, 2=John; 1=Tech, 2=Travel, 3=SQL, 4=WebDev)
-- Post 1: Tech post by Jane
INSERT INTO posts (author_id, title, body, published_at) VALUES
(1, 'The Future of AI', 'AI is rapidly evolving...', '2025-03-15 10:00:00');
INSERT INTO post_categories (post_id, category_id) VALUES (1, 1); -- Tech
-- Post 2: Travel post by John
INSERT INTO posts (author_id, title, body, published_at) VALUES
(2, 'Exploring the Alps', 'Hiking through breathtaking scenery...', '2025-03-20 14:30:00');
INSERT INTO post_categories (post_id, category_id) VALUES (2, 2); -- Travel
-- Post 3: SQL post by Jane
INSERT INTO posts (author_id, title, body, published_at) VALUES
(1, 'Mastering SQL Joins', 'Inner, Left, Right, Full Outer...', '2025-04-01 09:00:00');
INSERT INTO post_categories (post_id, category_id) VALUES (3, 1), (3, 3); -- Tech, SQL
-- Post 4: Draft post by Jane
INSERT INTO posts (author_id, title, body, published_at) VALUES
(1, 'Web Dev Trends 2025', 'Upcoming trends...', NULL); -- Draft
INSERT INTO post_categories (post_id, category_id) VALUES (4, 1), (4, 4); -- Tech, Web Development
Tasks: Write SQL queries to:
- Get Published Posts: Select the
title
,body
,published_at
, and theauthor_name
for all posts that have a non-NULLpublished_at
date, ordered bypublished_at
descending (most recent first). (RequiresJOIN
,WHERE
,ORDER BY
). - Get Posts by Category: Select the
title
andpublished_at
date for all published posts belonging to a specific category name (e.g., ‘Technology’). (Requires joiningposts
,post_categories
,categories
, usingWHERE
oncategory_name
andpublished_at
). - Get Posts by Author: Select the
title
andpublished_at
date for all published posts written by a specific author name (e.g., ‘Jane Doe’). (Requires joiningposts
andauthors
, usingWHERE
onauthor_name
andpublished_at
). - Get Post with Author and Categories: For a single specific post (e.g.,
post_id
= 3), retrieve itstitle
,body
, theauthor_name
, and a list/aggregation of allcategory_name
s it belongs to. (The category list part might requireGROUP_CONCAT
(MySQL/SQLite) orSTRING_AGG
(PostgreSQL) or simply returning multiple rows if aggregation is too complex). - (Optional) Count Posts per Author: List each
author_name
and the total number of published posts they have written. (RequiresJOIN
,WHERE
,GROUP BY
,COUNT
).
Checking Your Work & Next Steps:
- Run each query in your SQL client. Does the output look correct based on the task description and sample data?
- Are you using the appropriate clauses (
WHERE
,JOIN
,GROUP BY
,HAVING
,ORDER BY
,LIMIT
)? - Are your joins connecting tables on the correct keys?
- Could your query be written more efficiently or readably (e.g., using aliases, CTEs for complex parts)?
Completing one of these projects demonstrates a solid practical understanding of SQL. This is a significant achievement! From here, you can continue exploring the advanced topics mentioned in Module 7 or start applying your SQL skills in your own projects or work. Good luck!