âś… Deliverables
- Problem statement: who acts on predictions and what action is taken.
- Label definition: how you define positive vs negative class (and why).
- Baseline: a simple logistic regression model with a small, clearly documented feature set.
- Final model: an improved pipeline (regularization + feature selection + proper evaluation); a sketch of both steps follows this list.
- Decision threshold: choose the threshold from the costs and benefits of each error type, not the default 0.5.
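A minimal sketch of the baseline and the regularized final model, using scikit-learn on synthetic data so it runs end to end; the `make_classification` dataset and the 90/10 class weights are placeholders for your own feature matrix and label.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV

# Placeholder data: imbalanced binary problem (roughly 10% positives).
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Baseline: plain logistic regression on scaled features.
baseline = Pipeline([("scale", StandardScaler()),
                     ("clf", LogisticRegression(max_iter=1000))])
baseline.fit(X_train, y_train)

# Final model: L1-penalized logistic regression with cross-validated strength;
# the L1 penalty drives uninformative coefficients to zero (implicit feature selection).
final = Pipeline([("scale", StandardScaler()),
                  ("clf", LogisticRegressionCV(penalty="l1", solver="liblinear",
                                               Cs=10, cv=5, max_iter=1000))])
final.fit(X_train, y_train)
```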
📏 Recommended evaluation
- Core metrics: ROC‑AUC, PR‑AUC, F1, precision/recall
- Decision metrics: confusion matrix at chosen threshold
- Calibration: a reliability check if the predicted probabilities are used as actual numbers (e.g., for pricing or expected-value decisions), not just to rank cases
For imbalanced problems, PR‑AUC usually reflects real‑world performance better than accuracy, because it focuses on the rare positive class rather than the dominant negatives.
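A sketch of this evaluation on the held-out test set, continuing the `final`, `X_test`, and `y_test` names from the pipeline sketch above; the 0.5 threshold here is only a placeholder until a cost-based one is chosen (see the Start here tip).

```python
from sklearn.metrics import (roc_auc_score, average_precision_score, f1_score,
                             precision_score, recall_score, confusion_matrix)
from sklearn.calibration import calibration_curve

proba = final.predict_proba(X_test)[:, 1]  # predicted P(positive class)

# Threshold-free ranking metrics.
print("ROC-AUC:", roc_auc_score(y_test, proba))
print("PR-AUC :", average_precision_score(y_test, proba))  # average precision summarizes the PR curve

# Decision metrics at a (placeholder) threshold.
threshold = 0.5
pred = (proba >= threshold).astype(int)
print("F1       :", f1_score(y_test, pred))
print("Precision:", precision_score(y_test, pred))
print("Recall   :", recall_score(y_test, pred))
print("Confusion matrix:\n", confusion_matrix(y_test, pred))

# Calibration: observed positive rate per predicted-probability bin.
frac_pos, mean_pred = calibration_curve(y_test, proba, n_bins=10)
for p, f in zip(mean_pred, frac_pos):
    print(f"predicted {p:.2f} -> observed {f:.2f}")
```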
đź§© Suggested project themes
- Churn: predict which customers will leave next month
- Fraud: detect suspicious transactions
- Default risk: approve/deny credit decisions
- Conversion: predict purchase from a campaign touch
🚀 Start here
Use Module 3 as your foundation (logistic regression + regularization). Build a clean baseline first, then iterate.
Tip: record the threshold you chose and why (relative cost of false positives vs false negatives). That record is what turns a scored model into a deployable decision rule; a worked sketch follows.
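One way to make that choice concrete, continuing the `proba` and `y_test` names from the evaluation sketch: assign a unit cost to each false positive and false negative (the figures below are made-up placeholders) and pick the threshold that minimizes total cost. For brevity this reuses the test split; in practice, tune the threshold on a separate validation split.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical unit costs: replace with numbers from your problem statement.
COST_FP = 5.0    # cost of acting on someone who was never going to be positive
COST_FN = 50.0   # cost of missing a true positive

def total_cost(y_true, proba, threshold):
    """Total misclassification cost at a given decision threshold."""
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, pred).ravel()
    return fp * COST_FP + fn * COST_FN

thresholds = np.linspace(0.01, 0.99, 99)
costs = [total_cost(y_test, proba, t) for t in thresholds]
best_t = thresholds[int(np.argmin(costs))]

# Log the choice and the reasoning alongside the model artifact.
print(f"Chosen threshold: {best_t:.2f} "
      f"(COST_FP={COST_FP}, COST_FN={COST_FN}, min total cost={min(costs):.1f})")
```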