Analytics over Partitions
Window functions compute a value for each row based on a set of related rows (the "window"), without collapsing the result into a single row like GROUP BY does. You keep all your detail rows and add an analytical column alongside them.
-- Compare: GROUP BY collapses rows SELECT category, SUM(unit_price) AS total FROM products GROUP BY category; -- Returns one row per category -- Window function: keeps all rows, adds a column SELECT product_name, category, unit_price, SUM(unit_price) OVER (PARTITION BY category) AS category_total FROM products;
-- ROW_NUMBER: unique sequential number within each partition SELECT product_name, category, unit_price, ROW_NUMBER() OVER ( PARTITION BY category ORDER BY unit_price DESC ) AS price_rank FROM products; -- RANK: same rank for ties, skips next rank -- Prices: 100, 100, 80 → ranks: 1, 1, 3 SELECT product_name, unit_price, RANK() OVER (ORDER BY unit_price DESC) AS rnk FROM products; -- DENSE_RANK: same rank for ties, no gaps -- Prices: 100, 100, 80 → ranks: 1, 1, 2 SELECT product_name, unit_price, DENSE_RANK() OVER (ORDER BY unit_price DESC) AS dense_rnk FROM products; -- Top 3 products per category WITH ranked AS ( SELECT product_name, category, unit_price, ROW_NUMBER() OVER ( PARTITION BY category ORDER BY unit_price DESC ) AS rn FROM products ) SELECT * FROM ranked WHERE rn <= 3;
-- Divide products into 4 price quartiles SELECT product_name, unit_price, NTILE(4) OVER (ORDER BY unit_price) AS price_quartile FROM products; -- Deciles for customer spending SELECT customer_id, total_spend, NTILE(10) OVER (ORDER BY total_spend DESC) AS spend_decile FROM ( SELECT customer_id, SUM(total) AS total_spend FROM orders GROUP BY customer_id );
LAG accesses a value from a previous row; LEAD accesses a value from a following row. Essential for period-over-period comparisons.
-- Month-over-month revenue change WITH monthly AS ( SELECT STRFTIME('%Y-%m', order_date) AS month, SUM(total) AS revenue FROM orders GROUP BY month ) SELECT month, revenue, LAG(revenue) OVER (ORDER BY month) AS prev_month, revenue - LAG(revenue) OVER (ORDER BY month) AS change, ROUND( (revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 / LAG(revenue) OVER (ORDER BY month), 1) AS pct_change FROM monthly; -- LEAD: look ahead to the next row SELECT order_id, order_date, LEAD(order_date) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS next_order_date FROM orders;
-- Running total of revenue SELECT order_date, total, SUM(total) OVER ( ORDER BY order_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_total FROM orders; -- 3-month moving average WITH monthly AS ( SELECT STRFTIME('%Y-%m', order_date) AS month, SUM(total) AS revenue FROM orders GROUP BY month ) SELECT month, revenue, AVG(revenue) OVER ( ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS moving_avg_3m FROM monthly;
-- Compare each product's price to the cheapest in its category SELECT product_name, category, unit_price, FIRST_VALUE(product_name) OVER ( PARTITION BY category ORDER BY unit_price ASC ) AS cheapest_product, unit_price - FIRST_VALUE(unit_price) OVER ( PARTITION BY category ORDER BY unit_price ASC ) AS price_above_min FROM products;
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
-- ROWS BETWEEN controls which rows the function sees -- UNBOUNDED PRECEDING = first row of partition -- N PRECEDING = N rows before current -- CURRENT ROW = current row -- N FOLLOWING = N rows after current -- UNBOUNDED FOLLOWING = last row of partition -- Running total (default frame for ordered window) SUM(total) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- Centered 5-row moving average AVG(total) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) -- Grand total (entire partition) SUM(total) OVER ()
-- Define a window once, reuse it (PostgreSQL and newer SQLite) SELECT product_name, category, unit_price, ROW_NUMBER() OVER w AS rn, RANK() OVER w AS rnk, SUM(unit_price) OVER w AS running_total FROM products WINDOW w AS (PARTITION BY category ORDER BY unit_price DESC);
Write a query that ranks all products by price within each category using DENSE_RANK. Return the product_name, category, unit_price, and rank.
Using LAG, write a query that shows month-over-month revenue growth as a percentage. Include the month, revenue, previous month's revenue, and the percentage change.
Write a query that computes a running total of order amounts for each customer, partitioned by customer_id and ordered by order_date.
Use the CTE + ROW_NUMBER pattern to find the single most expensive product in each category. Return the category, product_name, and unit_price.