Variables, Collections, Loops, and Conditionals
Python distinguishes integers (whole numbers) from floating-point numbers (decimals). Arithmetic follows standard precedence, and division with / always returns a float.
    units = 500                  # int
    price = 24.99                # float
    revenue = units * price      # 12495.0 (float)

    # Integer division and modulo
    boxes = units // 12          # 41 (floor division)
    leftover = units % 12        # 8 (remainder)
    print(f"{boxes} full boxes, {leftover} loose units")
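As a quick check of the division and precedence rules described above, the following minimal sketch shows that / yields a float even when the result is a whole number, and that multiplication binds tighter than addition. The variable names are illustrative only.

```python
# True division always produces a float, even for exact results
half = 10 / 2
print(half, type(half).__name__)     # 5.0 float

# Floor division of two ints stays an int
whole = 10 // 2
print(whole, type(whole).__name__)   # 5 int

# Standard precedence: * before +, parentheses override
subtotal = 2 + 3 * 4      # 14
grouped = (2 + 3) * 4     # 20
print(subtotal, grouped)
```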
Strings are sequences of characters. They support indexing, slicing, and a rich set of methods for cleaning text data.
    sku = " WDG-2024-A "

    # Common string methods
    print(sku.strip())               # "WDG-2024-A"
    print(sku.strip().lower())       # "wdg-2024-a"
    print(sku.strip().split("-"))    # ['WDG', '2024', 'A']

    # Slicing: string[start:stop:step]
    code = "ABCDEFGH"
    print(code[0:3])    # "ABC"
    print(code[::2])    # "ACEG"
Strings are immutable in Python, so every method returns a new string rather than modifying the original. The examples below cover the methods you will use most often when cleaning real-world data.
    raw = " Hello, World! "

    # Whitespace removal
    print(raw.strip())     # "Hello, World!" (both sides)
    print(raw.lstrip())    # "Hello, World! " (left only)
    print(raw.rstrip())    # " Hello, World!" (right only)

    # Search and replace
    msg = "Order #1234 shipped"
    print(msg.replace("shipped", "delivered"))   # "Order #1234 delivered"
    print(msg.find("#"))                         # 6 (index of first match)
    print(msg.startswith("Order"))               # True
    print(msg.endswith("shipped"))               # True

    # Split and join: critical for parsing data
    csv_line = "Widget,29.99,500,East"
    fields = csv_line.split(",")          # ['Widget', '29.99', '500', 'East']
    reconstructed = " | ".join(fields)    # "Widget | 29.99 | 500 | East"

    # Case conversion
    print("hello".upper())          # "HELLO"
    print("HELLO".lower())          # "hello"
    print("hello world".title())    # "Hello World"
f-strings (formatted string literals, introduced in Python 3.6) are the most readable way to embed expressions inside strings. The format specification mini-language gives you fine control over alignment, padding, and number formatting.
    product = "Widget"
    price = 1234.5
    pct = 0.0873

    # Number formatting
    print(f"{price:,.2f}")    # "1,234.50" (commas + 2 decimals)
    print(f"{pct:.1%}")       # "8.7%" (percentage format)
    print(f"{price:.0f}")     # "1234" (no decimals)

    # Alignment and padding
    print(f"{product:<15}${price:>10,.2f}")   # "Widget         $  1,234.50"
    print(f"{product:*^20}")                  # "*******Widget*******"

    # Expressions inside f-strings
    print(f"Tax: ${price * 0.08:,.2f}")       # "Tax: $98.76"
Boolean values (True/False) are the backbone of conditional logic. Comparison operators return booleans.
    on_time = True
    lead_time = 7
    print(lead_time > 5)                # True
    print(lead_time == 7 and on_time)   # True
    print(not on_time)                  # False
Lists are ordered, mutable collections. They are the workhorse data structure for storing sequences of items.
    # Create and modify lists
    warehouses = ["Newark", "Chicago", "Dallas"]
    warehouses.append("Seattle")
    warehouses.insert(1, "Atlanta")
    print(warehouses)        # ['Newark', 'Atlanta', 'Chicago', 'Dallas', 'Seattle']
    print(len(warehouses))   # 5
    print(warehouses[-1])    # "Seattle" (last element)

    # Slicing returns a new list
    first_three = warehouses[:3]
    print(first_three)       # ['Newark', 'Atlanta', 'Chicago']
Dictionaries store key-value pairs. They are ideal for lookup tables, configuration settings, and mapping IDs to records.
    # Product catalog as a dictionary
    product = {
        "sku": "WDG-2024",
        "name": "Widget A",
        "price": 29.99,
        "stock": 1250
    }

    # Access and update
    print(product["name"])              # "Widget A"
    product["stock"] -= 100             # Sell 100 units
    product["category"] = "Hardware"    # Add new key

    # Safe access with .get()
    weight = product.get("weight", "N/A")   # "N/A" (key missing)
Tuples are immutable sequences. Use them for fixed collections like coordinates, database records, or function return values.
    # Warehouse coordinates (lat, lon)
    location = (40.7128, -74.0060)
    lat, lon = location   # Tuple unpacking
    print(f"Latitude: {lat}, Longitude: {lon}")
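The text above mentions function return values as a common use for tuples. A minimal sketch of that pattern follows; the helper stock_range and its sample data are hypothetical and exist only for illustration. The last part shows what immutability means in practice.

```python
def stock_range(levels):
    """Return the lowest and highest stock level as a (low, high) tuple."""
    return min(levels), max(levels)

# Hypothetical stock counts for illustration
levels = [120, 45, 200, 30]
low, high = stock_range(levels)            # unpack the returned tuple
print(f"Lowest: {low}, highest: {high}")   # Lowest: 30, highest: 200

# Tuples cannot be changed in place
location = (40.7128, -74.0060)
try:
    location[0] = 41.0
except TypeError as exc:
    print(f"Tuples are immutable: {exc}")
```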
A set is an unordered collection of unique elements. Sets are useful when you need to remove duplicates, test membership quickly, or compute intersections and unions. The frozenset is an immutable version that can be used as a dictionary key or stored in another set.
    # Create sets
    east_skus = {"A101", "B202", "C303", "A101"}   # duplicate removed
    west_skus = {"B202", "D404", "E505"}
    print(east_skus)   # {'A101', 'B202', 'C303'} (display order may vary)

    # Set operations
    print(east_skus & west_skus)   # {'B202'}: intersection (sold in both)
    print(east_skus | west_skus)   # all unique SKUs: union
    print(east_skus - west_skus)   # {'A101', 'C303'}: East only

    # Membership test (O(1) average, much faster than lists)
    print("A101" in east_skus)     # True

    # frozenset: immutable, can be a dict key
    region_key = frozenset(["East", "Central"])
    coverage = {region_key: 0.92}

    # Remove duplicates from a list
    raw_ids = [1, 2, 2, 3, 3, 3]
    unique_ids = list(set(raw_ids))   # [1, 2, 3] (sets do not guarantee order; use sorted() if order matters)
    service_level = 0.92
    if service_level >= 0.95:
        rating = "Excellent"
    elif service_level >= 0.90:
        rating = "Good"
    elif service_level >= 0.80:
        rating = "Acceptable"
    else:
        rating = "Below Standard"
    print(f"Service level {service_level:.0%} — Rating: {rating}")
Use for to iterate over a known collection and while when you need to loop until a condition changes.
    # For loop with enumerate
    products = ["Widget", "Gadget", "Bracket"]
    for i, name in enumerate(products, start=1):
        print(f"{i}. {name}")

    # While loop: reorder point check
    inventory = 200
    daily_demand = 15
    day = 0
    while inventory > 50:
        inventory -= daily_demand
        day += 1
    print(f"Reorder needed on day {day} (stock: {inventory})")
Two built-in functions make loops cleaner and eliminate the need for manual index tracking. enumerate() gives you both the index and the value. zip() pairs elements from two or more iterables together.
    # enumerate: avoids manual counter variables
    skus = ["A101", "B202", "C303"]
    for idx, sku in enumerate(skus):
        print(f"Row {idx}: {sku}")

    # zip: iterate over multiple lists in parallel
    names = ["Widget", "Gadget", "Bracket"]
    prices = [29.99, 14.50, 7.25]
    stocks = [500, 320, 750]
    for name, price, stock in zip(names, prices, stocks):
        print(f"{name:10} ${price:6.2f} stock: {stock}")

    # Combine enumerate + zip for index + paired values
    for i, (name, price) in enumerate(zip(names, prices), 1):
        print(f"{i}. {name}: ${price}")
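Without zip, you would iterate using index variables like for i in range(len(names)) and then access names[i], prices[i], etc. This is error-prone and hard to read. zip produces cleaner code, but note that it stops at the shortest iterable, so a length mismatch silently drops the extra items rather than raising an error. In Python 3.10 and later, pass strict=True to raise a ValueError when the lengths differ.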
Real-world data often involves structures nested inside other structures: lists of dictionaries (like rows from a database), dictionaries of lists (like column-oriented data), or dictionaries containing other dictionaries. Understanding how to navigate and build these is essential for working with JSON data, API responses, and configuration files.
    # List of dictionaries: each dict is a record (row)
    orders = [
        {"id": 1001, "product": "Widget", "qty": 50, "region": "East"},
        {"id": 1002, "product": "Gadget", "qty": 120, "region": "West"},
        {"id": 1003, "product": "Widget", "qty": 30, "region": "East"},
    ]

    # Access nested data
    print(orders[0]["product"])   # "Widget"

    # Loop through records
    total_qty = sum(order["qty"] for order in orders)
    print(f"Total units ordered: {total_qty}")

    # Dictionary of lists: column-oriented layout
    columns = {
        "month": ["Jan", "Feb", "Mar"],
        "revenue": [45000, 52000, 48000],
        "cost": [30000, 33000, 31000]
    }

    # Access a column
    print(columns["revenue"])     # [45000, 52000, 48000]
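The third shape mentioned above, dictionaries containing other dictionaries, can be sketched in the same spirit. The catalog and the JSON payload below are hypothetical sample data; json.loads comes from the standard library and returns the same kind of nested dicts and lists.

```python
import json

# Dictionary of dictionaries: SKU -> record (hypothetical sample data)
catalog = {
    "A101": {"name": "Widget", "price": 29.99, "stock": {"East": 300, "West": 200}},
    "B202": {"name": "Gadget", "price": 14.50, "stock": {"East": 320}},
}

# Chain keys to reach a nested value
print(catalog["A101"]["stock"]["West"])      # 200

# Chain .get() with defaults when a level may be missing
west_gadgets = catalog.get("B202", {}).get("stock", {}).get("West", 0)
print(west_gadgets)                          # 0

# JSON text maps directly onto these nested structures
payload = '{"order": {"id": 1001, "items": [{"sku": "A101", "qty": 50}]}}'
data = json.loads(payload)
first_qty = data["order"]["items"][0]["qty"]
print(first_qty)                             # 50
```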
A list comprehension is a concise way to create a list by combining a loop and an optional condition in a single line. Dictionary comprehensions follow the same pattern but produce key-value pairs.
    prices = [12.5, 8.0, 25.0, 3.5, 19.99, 45.0]

    # List comprehension: filter and transform
    premium = [p for p in prices if p > 15]
    print(premium)   # [25.0, 19.99, 45.0]

    # Apply 10% discount
    discounted = [p * 0.9 for p in prices]
    print(discounted)

    # Dictionary comprehension: create a lookup table
    inventory = {"A": 120, "B": 45, "C": 200, "D": 30}
    low_stock = {sku: qty for sku, qty in inventory.items() if qty < 50}
    print(low_stock)   # {'B': 45, 'D': 30}

    # Invert a dictionary (swap keys and values)
    region_codes = {"East": "E", "West": "W", "Central": "C"}
    code_to_region = {v: k for k, v in region_codes.items()}
    print(code_to_region)   # {'E': 'East', 'W': 'West', 'C': 'Central'}

    # Set comprehension (uses the orders list from the nested-structures example)
    categories = {order["region"] for order in orders}
    print(categories)   # {'East', 'West'} (display order may vary)
If a comprehension has more than one if clause or spans more than about 80 characters, rewrite it as a regular for-loop. Comprehensions are meant to simplify simple patterns, not to compress complex logic into a single unreadable line.
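To illustrate the readability guideline above, here is a small sketch that writes the same filter twice: once as a dense comprehension and once as a regular loop. The order records are sample data mirroring the earlier orders list, and the thresholds are arbitrary.

```python
# Hypothetical order records for illustration
orders = [
    {"id": 1001, "product": "Widget", "qty": 50, "region": "East"},
    {"id": 1002, "product": "Gadget", "qty": 120, "region": "West"},
    {"id": 1003, "product": "Widget", "qty": 30, "region": "East"},
]

# Hard to read: two conditions and a transformation on one line
dense = [o["qty"] * 2 for o in orders if o["region"] == "East" if o["qty"] >= 40]

# Same logic as a regular loop
doubled = []
for order in orders:
    if order["region"] != "East":
        continue            # skip other regions
    if order["qty"] < 40:
        continue            # skip small orders
    doubled.append(order["qty"] * 2)

print(dense == doubled, doubled)   # True [100]
```

The loop is longer, but each condition sits on its own line and can carry its own comment.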
When your code encounters an error at runtime (like dividing by zero or converting an invalid string to a number), Python raises an exception and stops. The try/except block lets you catch these errors, handle them gracefully, and keep your program running. This is especially important when processing real-world data that may contain unexpected values.
    # Basic try/except
    try:
        price = float("not_a_number")
    except ValueError:
        print("Could not convert to float — check your data")

    # Handling multiple error types
    def safe_divide(a, b):
        try:
            return a / b
        except ZeroDivisionError:
            print("Cannot divide by zero")
            return None
        except TypeError:
            print("Both arguments must be numbers")
            return None

    print(safe_divide(10, 0))     # Cannot divide by zero → None
    print(safe_divide(10, "a"))   # Both arguments must be numbers → None
    print(safe_divide(10, 3))     # 3.333...

    # try/except/else/finally: the full pattern
    try:
        f = open("data.csv")
    except FileNotFoundError:
        print("File not found — check the path")
    else:
        print(f"Opened file with {len(f.readlines())} lines")
        f.close()
    finally:
        print("This runs no matter what")
A bare except: that does not specify the error type hides bugs and makes debugging very difficult. Always catch the specific exception types you expect (ValueError, KeyError, FileNotFoundError, etc.).
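The following sketch shows how an untyped except: can hide a bug. Both helper functions and the sample record are hypothetical; the first contains a deliberate typo in the dictionary key, which the bare except quietly turns into a wrong answer.

```python
def parse_qty_bare(record):
    """Anti-pattern: a bare except swallows every error, including typos."""
    try:
        return int(record["quantity"])   # bug: the key is actually "qty"
    except:                              # bad example, shown on purpose
        return 0

def parse_qty_specific(record):
    """Catch only the conversion error we expect from dirty data."""
    try:
        return int(record["qty"])
    except ValueError:
        return 0

record = {"qty": "50"}
print(parse_qty_bare(record))               # 0  (the KeyError was hidden)
print(parse_qty_specific(record))           # 50
print(parse_qty_specific({"qty": "N/A"}))   # 0
```

With the specific version, a misspelled key would surface as a KeyError right away instead of producing a silent zero.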
Given the dictionary inventory = {"A": 120, "B": 45, "C": 200, "D": 30, "E": 95}, write a loop that prints each SKU that has stock below 50 units. Then rewrite it as a dictionary comprehension that returns only the low-stock SKUs with their quantities.
Create a list of monthly sales figures for 12 months. Use a for loop to compute the running cumulative total and store each month's cumulative value in a new list. Print both lists side by side.
Given two lists, products = ["Widget", "Gadget", "Bracket"] and prices = [29.99, 14.50, 7.25], use zip() to create a dictionary mapping product names to prices. Then use a dictionary comprehension to create a new dictionary containing only products that cost more than $10.
Write a function that takes a list of strings representing prices (e.g., ["29.99", "N/A", "14.50", "", "7.25"]) and returns a list of floats, replacing any invalid entries with 0.0. Use try/except inside your loop to handle conversion errors gracefully.
split(), join(), strip(), and replace() are essential for cleaning real-world data.
enumerate() and zip() eliminate manual index tracking and produce cleaner loops.
Use try/except to handle errors gracefully; always catch specific exception types.