Combining Data from Multiple Tables
Relational databases split data across multiple tables to avoid duplication. An orders table stores order_id and customer_id, but not the customer's name. A join reconnects the tables by matching related columns, letting you answer questions that span multiple tables.
orders.customer_id references customers.customer_id. Joins use these relationships to combine rows.
INNER JOIN returns only rows where the join condition matches in both tables. If an order references a customer_id that does not exist in the customers table, that order is excluded.
-- Orders with customer details SELECT o.order_id, o.order_date, c.first_name, c.last_name, o.total FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id ORDER BY o.order_date DESC; -- Multiple join conditions SELECT * FROM shipments s INNER JOIN orders o ON s.order_id = o.order_id AND s.warehouse_id = o.warehouse_id;
o.order_id vs. c.customer_id.
LEFT JOIN returns all rows from the left table and matching rows from the right table. If there is no match, the right-side columns are filled with NULL.
-- All customers, including those with no orders SELECT c.customer_id, c.first_name, c.last_name, o.order_id, o.total FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id; -- Find customers who have NEVER placed an order SELECT c.customer_id, c.first_name, c.last_name FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id WHERE o.order_id IS NULL;
-- RIGHT JOIN: all rows from the right table -- (same as swapping table order in LEFT JOIN) SELECT o.order_id, p.product_name FROM order_items o RIGHT JOIN products p ON o.product_id = p.product_id; -- Shows all products, including unsold ones -- FULL OUTER JOIN: all rows from both tables -- (Not supported in SQLite; use UNION of LEFT JOINs instead) SELECT c.customer_id, c.first_name, o.order_id FROM customers c FULL OUTER JOIN orders o ON c.customer_id = o.customer_id; -- SQLite workaround for FULL OUTER JOIN: SELECT c.customer_id, c.first_name, o.order_id FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id UNION SELECT c.customer_id, c.first_name, o.order_id FROM orders o LEFT JOIN customers c ON c.customer_id = o.customer_id WHERE c.customer_id IS NULL;
A CROSS JOIN produces the Cartesian product: every row from the left table paired with every row from the right table. Useful for generating combinations or scaffolding date/product grids.
-- Generate all product-month combinations for reporting SELECT p.product_name, m.month_start FROM products p CROSS JOIN ( SELECT '2024-01-01' AS month_start UNION ALL SELECT '2024-02-01' UNION ALL SELECT '2024-03-01' ) m ORDER BY p.product_name, m.month_start;
A self-join joins a table to itself. This is useful for hierarchical data (employee-manager relationships) or comparing rows within the same table.
-- Employee-manager hierarchy SELECT e.name AS employee, m.name AS manager FROM employees e LEFT JOIN employees m ON e.manager_id = m.emp_id; -- Find products in the same category with different prices SELECT a.product_name AS product_a, b.product_name AS product_b, a.category, a.unit_price - b.unit_price AS price_diff FROM products a INNER JOIN products b ON a.category = b.category AND a.product_id < b.product_id; -- avoid duplicates
-- Order details with customer and product info SELECT o.order_id, c.first_name || ' ' || c.last_name AS customer, p.product_name, oi.quantity, oi.quantity * p.unit_price AS line_total FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id INNER JOIN order_items oi ON o.order_id = oi.order_id INNER JOIN products p ON oi.product_id = p.product_id ORDER BY o.order_id, p.product_name;
SELECT COUNT(*) FROM ....
-- Check for row multiplication SELECT COUNT(*) AS before_join FROM orders; SELECT COUNT(*) AS after_join FROM orders o INNER JOIN order_items oi ON o.order_id = oi.order_id; -- If after_join > before_join, the join expanded rows (expected with line items)
-- UNION: combine results, remove duplicates SELECT city FROM customers UNION SELECT city FROM suppliers; -- UNION ALL: combine without removing duplicates (faster) SELECT city FROM customers UNION ALL SELECT city FROM suppliers; -- INTERSECT: rows that appear in both queries SELECT customer_id FROM orders WHERE order_date >= '2024-01-01' INTERSECT SELECT customer_id FROM orders WHERE total > 500; -- EXCEPT: rows in the first query but not the second SELECT customer_id FROM customers EXCEPT SELECT customer_id FROM orders; -- Customers who have never ordered (alternative to LEFT JOIN + IS NULL)
Write a query that returns all orders with the customer's full name and order total. Use INNER JOIN between orders and customers. Sort by order_date descending.
Write a query that finds all products that have never been ordered. Use a LEFT JOIN between products and order_items, filtering for NULL order_item_id.
Create an employee-manager report using a self-join on an employees table. Show each employee's name alongside their manager's name. Include employees who have no manager (CEO/top level) by using LEFT JOIN.
Write a query that joins orders, order_items, and products to calculate the total revenue per product. Group by product_name and sort by revenue descending. Show the top 5 products.