Chapter 2: SELECT & Filtering

Retrieving and Filtering Data

2.1 The SELECT Statement

Every SQL query starts with SELECT. You specify which columns to retrieve, computed expressions, and the table to read from.

-- Select specific columns
SELECT product_name, unit_price
FROM products;

-- Arithmetic in SELECT
SELECT
    product_name,
    unit_price,
    unit_price * 1.08 AS price_with_tax
FROM products;

-- String concatenation
SELECT
    first_name || ' ' || last_name AS full_name
FROM customers;

2.2 WHERE Clause

WHERE filters rows before they appear in the result set. Only rows that satisfy the condition are returned.

-- Exact match
SELECT * FROM products
WHERE category = 'Electronics';

-- Numeric comparison
SELECT product_name, unit_price
FROM products
WHERE unit_price > 50;

-- Not equal (two equivalent syntaxes)
SELECT * FROM products
WHERE category <> 'Accessories';

SELECT * FROM products
WHERE category != 'Accessories';  -- also valid

2.3 Logical Operators

-- AND: both conditions must be true
SELECT * FROM products
WHERE category = 'Electronics' AND unit_price < 100;

-- OR: at least one condition must be true
SELECT * FROM products
WHERE category = 'Electronics' OR category = 'Hardware';

-- NOT: negates a condition
SELECT * FROM products
WHERE NOT category = 'Accessories';
Operator precedence: AND is evaluated before OR. Use parentheses to make your intent explicit. WHERE a = 1 OR b = 2 AND c = 3 means WHERE a = 1 OR (b = 2 AND c = 3). If you want (a = 1 OR b = 2) AND c = 3, you must write the parentheses.

2.4 Pattern Matching with LIKE

-- % matches zero or more characters
SELECT * FROM products
WHERE product_name LIKE 'Widget%';  -- starts with 'Widget'

SELECT * FROM customers
WHERE email LIKE '%@gmail.com';  -- ends with @gmail.com

SELECT * FROM products
WHERE product_name LIKE '%Pro%';  -- contains 'Pro' anywhere

-- _ matches exactly one character
SELECT * FROM products
WHERE product_name LIKE 'Widget _';  -- 'Widget A', 'Widget B', etc.

-- PostgreSQL: ILIKE for case-insensitive matching
-- SELECT * FROM products WHERE product_name ILIKE '%widget%';

2.5 Range and Set Operators

-- BETWEEN: inclusive range
SELECT * FROM products
WHERE unit_price BETWEEN 10 AND 100;

-- Equivalent to:
SELECT * FROM products
WHERE unit_price >= 10 AND unit_price <= 100;

-- IN: match any value in a list
SELECT * FROM products
WHERE category IN ('Electronics', 'Hardware');

-- NOT IN: exclude values
SELECT * FROM products
WHERE category NOT IN ('Accessories');

-- Date ranges with BETWEEN
SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';

2.6 NULL Handling

NULL means "unknown" or "missing." It is not the same as zero or an empty string. Any comparison with NULL returns NULL (not TRUE or FALSE), so you must use IS NULL / IS NOT NULL.

-- Find rows with missing values
SELECT * FROM customers
WHERE email IS NULL;

-- Find rows WITH values
SELECT * FROM customers
WHERE email IS NOT NULL;

-- COALESCE: return the first non-NULL value
SELECT
    product_name,
    COALESCE(category, 'Uncategorized') AS category
FROM products;

-- NULLIF: return NULL if two values are equal
SELECT
    product_name,
    NULLIF(in_stock, 0) AS stock_or_null
FROM products;
-- Useful to avoid division by zero: revenue / NULLIF(units, 0)
The NULL trap: WHERE column = NULL never returns any rows because NULL = NULL evaluates to NULL, not TRUE. Always use IS NULL. Similarly, NOT IN with a list containing NULL returns no rows. Use NOT EXISTS instead when NULLs may be present.

2.7 Sorting with ORDER BY

-- Ascending order (default)
SELECT * FROM products
ORDER BY unit_price ASC;

-- Descending order
SELECT * FROM products
ORDER BY unit_price DESC;

-- Multi-column sort: first by category, then by price within each category
SELECT * FROM products
ORDER BY category ASC, unit_price DESC;

-- Sort by a computed expression
SELECT
    product_name,
    unit_price * in_stock AS inventory_value
FROM products
ORDER BY inventory_value DESC;

2.8 LIMIT and OFFSET

-- Return only the first 5 rows
SELECT * FROM products
ORDER BY unit_price DESC
LIMIT 5;

-- Skip the first 5 rows, then return the next 5 (pagination)
SELECT * FROM products
ORDER BY product_name
LIMIT 5 OFFSET 5;

-- Top-N query: most expensive product
SELECT product_name, unit_price
FROM products
ORDER BY unit_price DESC
LIMIT 1;
LIMIT without ORDER BY is unpredictable. Without ORDER BY, the database returns rows in whatever order is convenient (often insertion order, but not guaranteed). Always pair LIMIT with ORDER BY so results are deterministic.

2.9 DISTINCT

-- Unique categories in the products table
SELECT DISTINCT category FROM products;

-- Unique combinations of city and state
SELECT DISTINCT city, state FROM customers;

-- Count unique values
SELECT COUNT(DISTINCT category) AS num_categories
FROM products;

2.10 CASE Expressions

CASE is SQL's if-else construct. It lets you create conditional logic directly in your query, which is invaluable for categorizing data on the fly.

-- Simple CASE: categorize products by price tier
SELECT
    product_name,
    unit_price,
    CASE
        WHEN unit_price < 20  THEN 'Budget'
        WHEN unit_price < 100 THEN 'Mid-range'
        ELSE 'Premium'
    END AS price_tier
FROM products;

-- CASE in ORDER BY (custom sort order)
SELECT * FROM orders
ORDER BY
    CASE status
        WHEN 'Urgent'    THEN 1
        WHEN 'Normal'    THEN 2
        WHEN 'Low'       THEN 3
        ELSE 4
    END;

-- CASE with aggregation (pivot-style counting)
SELECT
    COUNT(CASE WHEN category = 'Electronics' THEN 1 END) AS electronics,
    COUNT(CASE WHEN category = 'Hardware' THEN 1 END) AS hardware,
    COUNT(CASE WHEN category = 'Accessories' THEN 1 END) AS accessories
FROM products;

Exercise 2.1

Write a query that returns all products priced between $20 and $100 that are in the 'Electronics' or 'Hardware' category, sorted by price descending. Use BETWEEN and IN.

Exercise 2.2

Write a query that returns customers whose email is missing (NULL) or whose last name starts with 'S'. Use IS NULL and LIKE with appropriate logical operators.

Exercise 2.3

Using the CASE expression, create a query that labels each order as 'Small' (total < 50), 'Medium' (50-200), or 'Large' (> 200). Return the order_id, total, and the size label. Sort by total descending and show only the top 10.

Exercise 2.4

Write a query that returns the 3 cheapest products that are NOT in the 'Accessories' category and have stock greater than zero. Use NOT IN, comparison operators, ORDER BY, and LIMIT.

Official Resources

Chapter 2 Takeaways

← Chapter 1: Getting Started Chapter 3: Joins →