Chapter 3: Functions & Modules

Defining, Importing, and Packaging Code

3.1 Defining Functions with def

Functions let you wrap reusable logic behind a name. They take inputs (parameters), do work, and return outputs. Writing functions makes your analysis reproducible and your code easier to test.

def eoq(demand, order_cost, holding_cost):
    """Calculate Economic Order Quantity."""
    return (2 * demand * order_cost / holding_cost) ** 0.5

# Call the function
q_star = eoq(10000, 50, 2)
print(f"Optimal order quantity: {q_star:.0f} units")  # 707 units

3.2 Default Arguments and Keyword Arguments

Default values make parameters optional. Keyword arguments improve readability when calling functions with many parameters.

def shipping_cost(weight, rate_per_kg=2.50, surcharge=0):
    """Calculate shipping cost with optional surcharge."""
    return weight * rate_per_kg + surcharge

# Positional argument; the defaults fill in the rest
print(shipping_cost(10))                     # 25.0

# Keyword arguments (order doesn't matter)
print(shipping_cost(weight=10, surcharge=5))   # 30.0

3.3 *args and **kwargs

Use *args to accept any number of positional arguments, which arrive inside the function as a tuple, and **kwargs to accept any number of keyword arguments, which arrive as a dict.

def total_revenue(*sales):
    """Sum any number of sales figures."""
    return sum(sales)

print(total_revenue(1200, 3400, 2800))  # 7400

def build_order(**items):
    """Print order details from keyword arguments."""
    for product, qty in items.items():
        print(f"  {product}: {qty} units")

build_order(widgets=50, brackets=120, bolts=500)
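The two forms can also be combined in one signature, and the same * and ** operators unpack a list or dict at the call site. The sketch below illustrates this; the function order_summary and its discount and shipping options are hypothetical names chosen for the example.

```python
def order_summary(*prices, **options):
    """Hypothetical helper: total some prices, then apply optional settings."""
    # Inside the function, prices is a tuple and options is a dict.
    subtotal = sum(prices)
    discount = options.get("discount", 0.0)   # fraction, e.g. 0.10 for 10%
    shipping = options.get("shipping", 0.0)
    return round(subtotal * (1 - discount) + shipping, 2)

print(order_summary(20, 30, 50))                             # 100.0
print(order_summary(20, 30, 50, discount=0.10, shipping=5))  # 95.0

# Unpacking at the call site: * spreads a list, ** spreads a dict
line_items = [20, 30, 50]
settings = {"discount": 0.10, "shipping": 5}
print(order_summary(*line_items, **settings))                # 95.0
```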

3.4 Lambda Functions

Lambda functions are anonymous, single-expression functions. They are useful as short callbacks for sorting and filtering.

products = [
    {"name": "Widget", "price": 29.99},
    {"name": "Gadget", "price": 14.50},
    {"name": "Bracket", "price": 7.25},
]

# Sort by price descending
products.sort(key=lambda p: p["price"], reverse=True)
for p in products:
    print(f"{p['name']:10} ${p['price']:.2f}")
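A lambda is an ordinary function without a name, so anything it does can also be written with def. The sketch below sorts the same data three equivalent ways; the helper by_price is a hypothetical name, and operator.itemgetter is a standard-library alternative for simple key lookups. It uses sorted(), which returns a new list, rather than list.sort(), which sorts in place.

```python
from operator import itemgetter

products = [
    {"name": "Widget", "price": 29.99},
    {"name": "Gadget", "price": 14.50},
    {"name": "Bracket", "price": 7.25},
]

def by_price(product):
    """Named equivalent of: lambda p: p["price"]"""
    return product["price"]

with_lambda = sorted(products, key=lambda p: p["price"])
with_def = sorted(products, key=by_price)
with_itemgetter = sorted(products, key=itemgetter("price"))

print([p["name"] for p in with_lambda])          # ['Bracket', 'Gadget', 'Widget']
print(with_lambda == with_def == with_itemgetter)  # True
```

If the key function needs more than one expression, or is reused in several places, a named def is usually the clearer choice.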

3.5 Return Values

Functions can return multiple values as a tuple. This is common when a calculation produces several related outputs.

def inventory_stats(levels):
    """Return mean, min, and max of inventory levels."""
    avg = sum(levels) / len(levels)
    return avg, min(levels), max(levels)

daily_stock = [120, 95, 110, 88, 102, 130, 75]
mean_val, min_val, max_val = inventory_stats(daily_stock)
print(f"Avg: {mean_val:.1f}, Min: {min_val}, Max: {max_val}")
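It can help to see that the function really returns one object, a tuple, which the caller may keep whole, index, or unpack. A minimal sketch using the same function and data:

```python
def inventory_stats(levels):
    """Return mean, min, and max of inventory levels."""
    avg = sum(levels) / len(levels)
    return avg, min(levels), max(levels)

daily_stock = [120, 95, 110, 88, 102, 130, 75]

stats = inventory_stats(daily_stock)   # keep the result as a single tuple
print(type(stats).__name__)            # tuple
print(stats[1], stats[2])              # 75 130

# Unpack only what you need; "_" is a conventional name for ignored values
_, low, high = inventory_stats(daily_stock)
print(high - low)                      # 55
```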

3.6 Importing Modules

Python's standard library and third-party packages extend the language. Use import to bring them into your script.

# Standard library
import math
import os
from datetime import datetime, timedelta

# Common alias conventions
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Using a module
today = datetime.now()
lead_date = today + timedelta(days=14)
print(f"Order due by: {lead_date:%Y-%m-%d}")
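The two import forms differ in how you refer to the imported names afterwards: import math keeps everything behind the math. prefix, while from math import ceil brings ceil directly into your script. The sketch below applies both to the EOQ calculation from section 3.1; the orders-per-year figure is an illustrative extra, not part of the original example.

```python
import math            # names are accessed as math.<name>
from math import ceil  # ceil is available without the prefix

def eoq(demand, order_cost, holding_cost):
    """Calculate Economic Order Quantity using math.sqrt."""
    return math.sqrt(2 * demand * order_cost / holding_cost)

q_star = eoq(10000, 50, 2)
print(ceil(q_star))                # 708 (round up to whole units)

# Whole number of orders needed to cover annual demand at that quantity
orders_per_year = math.ceil(10000 / q_star)
print(orders_per_year)             # 15
```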

3.7 Installing Packages with pip

Run these commands in a terminal (shell), not inside the Python interpreter.

# Install a single package
pip install openpyxl

# Install from a requirements file
pip install -r requirements.txt

# Check installed packages
pip list

3.8 Type Hints

Type hints let you annotate function signatures with expected types. They do not enforce types at runtime; instead, they serve as documentation and enable static analysis tools (like mypy) to catch bugs before your code runs. Adding type hints to your analytics functions makes them easier for collaborators to understand.

def eoq(demand: float, order_cost: float, holding_cost: float) -> float:
    """Calculate Economic Order Quantity."""
    return (2 * demand * order_cost / holding_cost) ** 0.5

def classify_product(price: float) -> str:
    """Return 'Premium' or 'Standard' based on price threshold."""
    return "Premium" if price > 20 else "Standard"

# Common type hint patterns
from typing import Any, Dict, List, Optional

def top_products(
    catalog: List[Dict[str, Any]],  # each item has a str "name" and a float "revenue"
    n: int = 5
) -> List[str]:
    """Return names of top-n products by revenue."""
    sorted_items = sorted(catalog, key=lambda x: x["revenue"], reverse=True)
    return [item["name"] for item in sorted_items[:n]]

def find_sku(code: str) -> Optional[Dict]:
    """Return product dict or None if not found."""
    ...  # Optional means the function can return None

When to add type hints: Always add them to function signatures in shared code. For quick one-off analysis scripts, they are optional but still helpful. In Python 3.9+, you can write list[str] instead of List[str]; in Python 3.10+, you can also write str | None instead of Optional[str].
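The find_sku stub above has no body, so here is one possible way to fill it in, as a sketch only. The CATALOG dictionary, its SKU codes, and the describe helper are hypothetical and exist purely for illustration. The example also shows that hints are not checked at runtime: passing an int where a str is annotated raises no error.

```python
from typing import Any, Dict, Optional

# Hypothetical catalog keyed by SKU code; the data is illustrative only.
CATALOG: Dict[str, Dict[str, Any]] = {
    "WID-001": {"name": "Widget", "price": 29.99},
    "GAD-002": {"name": "Gadget", "price": 14.50},
}

def find_sku(code: str) -> Optional[Dict[str, Any]]:
    """Return the product dict for a SKU code, or None if it is unknown."""
    return CATALOG.get(code)

def describe(code: str) -> str:
    """Format a product line, handling the None case explicitly."""
    product = find_sku(code)
    if product is None:   # an Optional return forces callers to check
        return f"{code}: not found"
    return f"{code}: {product['name']} at ${product['price']:.2f}"

print(describe("WID-001"))  # WID-001: Widget at $29.99
print(describe("XYZ-999"))  # XYZ-999: not found

# Hints are documentation, not enforcement: this runs without a TypeError
print(find_sku(123))        # None
```

A static checker such as mypy would flag the find_sku(123) call, which is the kind of bug the hints are meant to surface before the code runs.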

3.9 Docstrings and Documentation

A docstring is a string literal placed as the first statement in a function, class, or module. It becomes that object's __doc__ attribute and is displayed by help(). Following a consistent docstring convention makes your code self-documenting.

def safety_stock(z_score: float, std_demand: float, lead_time: float) -> float:
    """
    Calculate safety stock for a given service level.

    Parameters
    ----------
    z_score : float
        Z-value for desired service level (e.g., 1.96 for 97.5%).
    std_demand : float
        Standard deviation of daily demand.
    lead_time : float
        Lead time in days.

    Returns
    -------
    float
        Safety stock quantity in units.

    Examples
    --------
    >>> round(safety_stock(1.96, 20, 7), 2)
    103.71
    """
    return z_score * std_demand * lead_time ** 0.5

# Access the docstring
help(safety_stock)
print(safety_stock.__doc__)

NumPy/Google style: The example above uses the NumPy docstring convention, which is the most common in data science. Google style is also popular and uses indented sections like Args: and Returns:. Pick one convention and use it consistently across your project.
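For comparison, here is roughly how the same function might look with a Google-style docstring. This is a sketch of the general layout (Args: and Returns: sections with indented entries), not an exhaustive rendering of the style guide.

```python
def safety_stock(z_score: float, std_demand: float, lead_time: float) -> float:
    """Calculate safety stock for a given service level.

    Args:
        z_score: Z-value for the desired service level.
        std_demand: Standard deviation of daily demand.
        lead_time: Lead time in days.

    Returns:
        Safety stock quantity in units.
    """
    return z_score * std_demand * lead_time ** 0.5

# The summary line is whatever comes first in the docstring
summary = safety_stock.__doc__.splitlines()[0]
print(summary)  # Calculate safety stock for a given service level.
```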

3.10 Scope: The LEGB Rule

When Python encounters a variable name, it searches for it in four scopes, in this order: Local (inside the current function), Enclosing (inside any enclosing function), Global (module level), Built-in (Python's built-in names like print and len). Understanding this rule prevents subtle bugs caused by variable shadowing.

discount = 0.10  # Global scope

def apply_discount(price):
    # 'price' is Local, 'discount' is found in Global scope
    return price * (1 - discount)

print(apply_discount(100))  # 90.0

def outer():
    multiplier = 2          # Enclosing scope
    def inner(x):
        return x * multiplier  # Found in Enclosing scope
    return inner(5)

print(outer())  # 10

# Common trap: shadowing built-ins
# list = [1, 2, 3]    # BAD: shadows the built-in list()
# print = "hello"     # BAD: shadows the built-in print()

Never shadow built-ins. If you name a variable list, dict, str, sum, or print, that name hides Python's built-in of the same name for the rest of the current scope, because Local and Global names are found before Built-in ones. This leads to confusing errors like TypeError: 'list' object is not callable. Use names like my_list or items instead.
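The following sketch reproduces that kind of error on a small scale. The function names broken_total and fixed_total are hypothetical; the shadowing is kept inside one function so that the built-in sum stays usable everywhere else.

```python
def broken_total(values):
    """Demonstrate what goes wrong when a local name shadows a built-in."""
    sum = 0                    # BAD: 'sum' is now a local int in this function
    for v in values:
        sum += v
    try:
        return sum(values)     # Local scope wins, so this tries to call an int
    except TypeError as exc:
        return f"TypeError: {exc}"

def fixed_total(values):
    """Same loop with a name that does not collide with a built-in."""
    total = 0
    for v in values:
        total += v
    return total

print(broken_total([1, 2, 3]))  # TypeError: 'int' object is not callable
print(fixed_total([1, 2, 3]))   # 6
print(sum([1, 2, 3]))           # 6 (the built-in is untouched outside the function)
```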

3.11 Decorators

A decorator is a function that wraps another function to add behavior before or after it runs, without changing the original function's code. The @decorator syntax is shorthand for func = decorator(func). Decorators are used extensively in web frameworks, testing, and performance monitoring.

import time

def timer(func):
    """Decorator that prints how long a function takes to run."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # high-resolution clock intended for timing
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timer
def slow_sum(n):
    """Sum numbers the slow way."""
    return sum(range(n))

slow_sum(10_000_000)  # prints something like "slow_sum took 0.2341s" (timing varies by machine)
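To make the func = decorator(func) equivalence concrete, the sketch below decorates one function with the @ syntax and another by hand. The shout decorator and the two greeting functions are hypothetical. It also uses functools.wraps from the standard library, which copies the original function's name and docstring onto the wrapper; without it, the decorated function would report its name as wrapper.

```python
import functools

def shout(func):
    """Hypothetical decorator: upper-case the string a function returns."""
    @functools.wraps(func)            # keep func's __name__ and __doc__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@shout
def greet(name):
    """Return a greeting."""
    return f"hello, {name}"

def farewell(name):
    """Return a farewell."""
    return f"goodbye, {name}"

farewell = shout(farewell)   # manual form of what @shout does

print(greet("ana"))          # HELLO, ANA
print(farewell("ana"))       # GOODBYE, ANA
print(greet.__name__)        # greet
print(greet.__doc__)         # Return a greeting.
```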

3.12 Generators and yield

A generator function uses yield instead of return. Calling it does not run the body; it returns a generator object. Each time that generator is asked for its next value, it runs until the next yield statement, produces a value, and pauses with its local variables intact. This is memory-efficient because the entire sequence is never stored in memory at once. Generators are ideal when processing large files or creating infinite sequences.

def monthly_demand(start, growth_rate, months):
    """Generate monthly demand with compound growth."""
    demand = start
    for month in range(1, months + 1):
        yield month, round(demand)
        demand *= (1 + growth_rate)

# Iterate — only one month in memory at a time
for month, units in monthly_demand(1000, 0.05, 6):
    print(f"Month {month}: {units} units")

# Generator expression (like a list comprehension, but lazy)
total = sum(x**2 for x in range(1_000_000))  # No list in memory

Generator vs. list comprehension: Use square brackets [...] when you need to access elements by index or iterate multiple times. Use parentheses (...) when you only need to iterate once or the data is too large to fit in memory. The rule of thumb: if the result feeds directly into sum(), max(), or another aggregation, use a generator.
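A small experiment illustrates both sides of that trade-off. Note that sys.getsizeof reports only the size of the container object itself (not the integers it refers to), so treat the comparison as indicative rather than an exact memory measurement.

```python
import sys

squares_list = [x**2 for x in range(100_000)]   # all values stored up front
squares_gen = (x**2 for x in range(100_000))    # values produced on demand

# The list object is far larger than the generator object
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# A list supports indexing and repeated iteration
print(squares_list[3])                           # 9
print(sum(squares_list) == sum(squares_list))    # True

# A generator can be consumed only once
first_pass = sum(squares_gen)
second_pass = sum(squares_gen)                   # already exhausted
print(first_pass == sum(squares_list))           # True
print(second_pass)                               # 0
```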

3.13 Recursion

A recursive function calls itself to solve a smaller version of the same problem. Each call adds a frame to the call stack, so Python limits recursion depth to about 1000 by default. Recursion is natural for tree-structured data (file systems, org charts), but for simple numeric computations, loops are usually clearer and faster.

def factorial(n: int) -> int:
    """Compute n! recursively."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)

print(factorial(5))   # 120 (5 * 4 * 3 * 2 * 1)

# Practical example: flatten a nested list
def flatten(nested):
    """Flatten arbitrarily nested lists."""
    result = []
    for item in nested:
        if isinstance(item, list):
            result.extend(flatten(item))
        else:
            result.append(item)
    return result

print(flatten([1, [2, [3, 4]], 5]))  # [1, 2, 3, 4, 5]
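The depth limit mentioned above is easy to observe. The sketch below reads the current limit with sys.getrecursionlimit(), deliberately recurses past it, and then computes the same factorial with a loop; the function names are hypothetical variants of the factorial example.

```python
import sys

def factorial_recursive(n):
    """Compute n! recursively (one stack frame per call)."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    """Compute n! with a loop (constant stack depth)."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

deep = sys.getrecursionlimit() + 100   # deliberately beyond the limit

try:
    factorial_recursive(deep)
    outcome = "ok"
except RecursionError:
    outcome = "RecursionError"

print(outcome)                          # RecursionError
print(factorial_iterative(deep) > 0)    # True (the loop handles the same input)
print(factorial_iterative(5))           # 120
```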

3.14 map, filter, and reduce

These functional programming tools apply a function to every element of an iterable. map() transforms each element, filter() keeps elements that pass a test, and reduce() accumulates a single result. In modern Python, list comprehensions often replace map() and filter(), but knowing them helps you read older code and is valuable when the transformation function already exists.

from functools import reduce

prices = [29.99, 14.50, 7.25, 45.00, 3.50]

# map — apply a function to each element
discounted = list(map(lambda p: round(p * 0.9, 2), prices))
print(discounted)  # [26.99, 13.05, 6.53, 40.5, 3.15]

# filter — keep elements where the function returns True
expensive = list(filter(lambda p: p > 10, prices))
print(expensive)   # [29.99, 14.5, 45.0]

# reduce — accumulate into a single value
total = reduce(lambda a, b: a + b, prices)
print(f"Total: ${total:.2f}")  # Total: $100.24

# Equivalent using sum() — preferred for addition
print(f"Total: ${sum(prices):.2f}")

Comprehension vs. map/filter: The list comprehension [p * 0.9 for p in prices] is generally preferred over list(map(lambda p: p * 0.9, prices)) because it is easier to read. However, when the function already exists (e.g., map(str, numbers) or map(len, strings)), using map() is often cleaner than a comprehension.
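The sketch below puts the two styles side by side on the same price list. The quantities and names lists are illustrative data added for the map(str, ...) and map(len, ...) cases.

```python
prices = [29.99, 14.50, 7.25, 45.00, 3.50]

# filter() with a lambda vs. the equivalent comprehension
expensive_filter = list(filter(lambda p: p > 10, prices))
expensive_comp = [p for p in prices if p > 10]
print(expensive_filter == expensive_comp)   # True

# A comprehension can filter and transform in a single expression
labels = [f"${p:.2f}" for p in prices if p > 10]
print(labels)                               # ['$29.99', '$14.50', '$45.00']

# map() reads well when the function already exists
quantities = [50, 120, 500]
names = ["Widget", "Gadget", "Bracket"]
print(list(map(str, quantities)))           # ['50', '120', '500']
print(list(map(len, names)))                # [6, 6, 7]
```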

3.15 Creating Your Own Module

Any .py file is a module. Save reusable functions in a file and import them elsewhere.

# File: sc_utils.py
def eoq(demand, order_cost, holding_cost):
    return (2 * demand * order_cost / holding_cost) ** 0.5

def safety_stock(z, std_demand, lead_time):
    return z * std_demand * lead_time ** 0.5

# File: analysis.py
from sc_utils import eoq, safety_stock

q = eoq(10000, 50, 2)
ss = safety_stock(1.96, 20, 7)
print(f"EOQ: {q:.0f}, Safety stock: {ss:.0f}")
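If you want to try this without creating files by hand, the sketch below writes a tiny module to a temporary folder and imports it from there. The module name sc_utils_demo is hypothetical, chosen so it cannot clash with a real sc_utils.py; in normal use you would simply keep the .py file next to your script. The import works because Python looks for modules in the folders listed in sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code for a throwaway module (hypothetical name: sc_utils_demo)
MODULE_SOURCE = '''
def eoq(demand, order_cost, holding_cost):
    return (2 * demand * order_cost / holding_cost) ** 0.5
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "sc_utils_demo.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, folder)          # let Python find the new file
    try:
        sc = importlib.import_module("sc_utils_demo")  # same as: import sc_utils_demo as sc
    finally:
        sys.path.remove(folder)         # leave sys.path as we found it

print(sc.__name__)                      # sc_utils_demo
print(round(sc.eoq(10000, 50, 2)))      # 707
```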

Exercise 3.1

Write a function classify_abc(items) that takes a list of (name, annual_revenue) tuples and returns three lists: A items (top 80% of cumulative revenue), B items (next 15%), and C items (remaining 5%). Test it with at least five products.

Exercise 3.2

Create a module called metrics.py with functions for fill_rate(fulfilled, requested) and otif(on_time_count, total_orders). Import them in a separate script and compute both metrics for sample data.

Exercise 3.3

Write a @timer decorator (as shown in section 3.11) and apply it to two functions: one that computes the sum of a range using a for-loop, and one that uses the built-in sum(). Compare their execution times for n = 10,000,000.

Exercise 3.4

Write a generator function fibonacci(n) that yields the first n Fibonacci numbers (1, 1, 2, 3, 5, 8, ...). Use it in a for-loop to print the first 15 Fibonacci numbers. Then compute their sum using sum(fibonacci(15)).
