Chapter 1: Getting Started

Installation, Tools, and Your First Queries

1.1 Why SQL for Analytics?

SQL (Structured Query Language) is the universal language for working with relational databases. Whether you are pulling sales reports, analyzing customer behavior, or building dashboards, SQL is the tool that sits between you and the data.

Almost every company stores its operational data in a relational database. Orders, customers, products, inventory levels, employee records: all of these live in tables made up of rows and columns. SQL lets you ask precise questions of that data without writing a full program.

The relational model organizes data into tables (also called relations). Each table has a fixed set of columns (attributes) and a variable number of rows (records). A column in one table can reference a column in another through a foreign key, linking the tables together. This structure eliminates redundancy and keeps data consistent.

Declarative vs. Procedural: SQL is declarative: you describe what data you want, not how to retrieve it. The database engine decides the most efficient execution plan. This is different from procedural languages like Python, where you write step-by-step instructions.

1.2 Installing SQLite

SQLite is the simplest way to start. It requires no server, no configuration, and runs as a single file on your computer. Download it from sqlite.org.

macOS: SQLite comes pre-installed. Open Terminal and type:

-- Check your SQLite version
sqlite3 --version

Windows: Download the "sqlite-tools" zip, extract it, and add the folder to your system PATH. Then open Command Prompt:

sqlite3 my_first_database.db

You should see the SQLite prompt: sqlite>. Type .quit to exit.

PostgreSQL (for Advanced Use)

For production-grade work, PostgreSQL is the most capable open-source database. Install via postgresql.org/download or use Homebrew on macOS: brew install postgresql@16.

1.3 Database Tools

ToolBest ForKey Feature
DB Browser for SQLiteBeginners, SQLite filesVisual table browser, no setup
DBeaverMulti-database workSupports 80+ databases, free Community edition
pgAdminPostgreSQLOfficial PostgreSQL admin tool
DataGripProfessional developmentIntelliJ-based, intelligent autocomplete
VS Code + SQLite extLightweight editingRuns queries inside your code editor
Recommendation: Start with DB Browser for SQLite for visual exploration. Move to DBeaver when you need to connect to PostgreSQL or MySQL. Both are free and cross-platform.

1.4 Creating Your First Database

A database starts with tables. The CREATE TABLE statement defines a table's name and columns. Each column has a name and a data type.

CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT NOT NULL,
    last_name     TEXT NOT NULL,
    email         TEXT,
    signup_date   TEXT    -- SQLite stores dates as TEXT
);

CREATE TABLE products (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT NOT NULL,
    category      TEXT,
    unit_price    REAL,
    in_stock      INTEGER DEFAULT 0
);

CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER REFERENCES customers(customer_id),
    order_date    TEXT,
    total         REAL
);
Common data types: INTEGER for whole numbers (IDs, quantities). TEXT for variable-length strings (names, emails). REAL for floating-point numbers (prices, measurements). In PostgreSQL you also have DATE, TIMESTAMP, NUMERIC, and BOOLEAN.

1.5 Inserting Data

Use INSERT INTO to add rows to a table. You specify the table, the columns, and the values.

INSERT INTO customers (customer_id, first_name, last_name, email, signup_date)
VALUES
    (1, 'Alice',  'Chen',    'alice@example.com',   '2024-01-15'),
    (2, 'Bob',    'Martinez','bob.m@example.com',   '2024-02-20'),
    (3, 'Carol',  'Johnson', 'carol.j@example.com', '2024-03-05'),
    (4, 'David',  'Kim',     NULL,                    '2024-03-18'),
    (5, 'Emma',   'Wilson',  'emma.w@example.com',  '2024-04-01');

INSERT INTO products (product_id, product_name, category, unit_price, in_stock)
VALUES
    (101, 'Wireless Mouse',     'Electronics', 29.99,  150),
    (102, 'Mechanical Keyboard', 'Electronics', 89.99,  75),
    (103, 'Notebook (A5)',       'Stationery',  4.50,   500),
    (104, 'Standing Desk',       'Furniture',   349.00, 20),
    (105, 'Monitor 27-inch',     'Electronics', 299.00, 45);

INSERT INTO orders (order_id, customer_id, order_date, total)
VALUES
    (1001, 1, '2024-03-10', 119.98),
    (1002, 2, '2024-03-12', 29.99),
    (1003, 1, '2024-04-01', 349.00),
    (1004, 3, '2024-04-05', 13.50),
    (1005, 5, '2024-04-10', 388.99);

1.6 Your First SELECT

The SELECT statement retrieves data. The simplest form selects all columns from a table.

-- Select all columns and all rows
SELECT * FROM customers;

-- Select specific columns
SELECT first_name, last_name, email
FROM customers;

-- Use aliases to rename columns in output
SELECT
    product_name        AS product,
    unit_price          AS price,
    unit_price * 1.08  AS price_with_tax,
    in_stock            AS inventory
FROM products;

-- Computed column: inventory value
SELECT
    product_name,
    unit_price,
    in_stock,
    unit_price * in_stock AS inventory_value
FROM products;
SELECT * vs. named columns: Use SELECT * for quick exploration. In production queries and scripts, always name the columns you need. This makes queries faster and protects your code from breaking when someone adds or renames a column.

1.7 SQL Dialects

SQL has an ISO standard, but every database adds its own extensions. The core statements (SELECT, INSERT, UPDATE, DELETE) work the same way everywhere. Differences show up in data types, functions, and advanced features.

FeatureSQLitePostgreSQLMySQL
String concatenation||||CONCAT()
Current dateDATE('now')CURRENT_DATECURDATE()
Auto-increment PKINTEGER PRIMARY KEYSERIAL / GENERATEDAUTO_INCREMENT
Case-insensitive LIKEDefault behaviorUse ILIKEDefault behavior
Boolean typeINTEGER (0/1)Native BOOLEANTINYINT(1)
Portability pitfalls: If you write queries in SQLite and later move to PostgreSQL, watch out for: (1) date/time functions are completely different, (2) SQLite has dynamic typing while PostgreSQL enforces types strictly, and (3) SQLite's GROUP BY allows non-aggregated columns that other databases reject.

1.8 Comments and Formatting

-- Single-line comment: everything after -- is ignored
SELECT first_name, last_name
FROM customers;

/*
   Multi-line comment:
   This query calculates inventory value
   for products in the Electronics category.
*/
SELECT
    product_name,
    unit_price * in_stock AS inventory_value
FROM products
WHERE category = 'Electronics';
Formatting conventions: Write SQL keywords in UPPERCASE. Write table and column names in lowercase with underscores. Put each major clause on its own line. Indent column lists. End every statement with a semicolon. These are conventions, not rules, but they make queries far easier to read.

1.9 Common Beginner Errors

-- 1. Missing semicolon (query hangs in many tools)
SELECT * FROM customers   -- no semicolon: won't execute
SELECT * FROM customers;  -- fixed

-- 2. Quoting: single quotes for values, double for identifiers
SELECT * FROM products WHERE category = 'Electronics';  -- correct
-- WHERE category = "Electronics"  -- works in SQLite, fails in PostgreSQL

-- 3. Case sensitivity in string comparisons
SELECT * FROM customers WHERE first_name = 'alice';  -- no match
SELECT * FROM customers WHERE first_name = 'Alice';  -- match

-- 4. Missing comma between columns (treated as alias)
SELECT product_name unit_price FROM products;  -- bug: alias
SELECT product_name, unit_price FROM products; -- fixed
Debugging strategy: When a query fails: (1) check for missing commas between column names, (2) check for missing or extra parentheses, (3) run just SELECT * FROM table to make sure the table exists and has the columns you expect. Read error messages carefully; look for the word "near" followed by a quoted token.

1.10 The SQLite Dot Commands

SQLite's command-line shell has special commands prefixed with a dot. These are not SQL statements; they are shell-specific utilities.

-- List all tables in the database
.tables

-- Show the CREATE statement for a table
.schema products

-- Toggle column headers and set readable output
.headers on
.mode column

-- Export query results to CSV
.mode csv
.output results.csv
SELECT * FROM products;
.output stdout

-- Import a CSV file into a table
.import data.csv products

-- Exit the shell
.quit

Exercise 1.1

Create a table called employees with columns: emp_id (INTEGER PRIMARY KEY), full_name (TEXT NOT NULL), department (TEXT), hire_date (TEXT), and salary (REAL). Insert at least 5 employees and write a SELECT query that shows all employees sorted by salary descending.

Exercise 1.2

Write a query against the products table that shows each product's name, unit price, a 15% discount price (labeled sale_price), and the savings amount. Which product has the largest savings in dollar terms?

Exercise 1.3

Create a customers table and insert 5 customers. Write a query that returns their full name (first + last) as a single column using the || concatenation operator.

Exercise 1.4

The following query has three errors. Identify and fix all of them:
SELECT product_name unit_price, in_stock FROM products WHERE category = Electronics

Hint: missing comma, missing quotes, missing semicolon.

Official Resources

Chapter 1 Takeaways

← Guide Home Chapter 2: SELECT & Filtering →