âś… Deliverables
- Problem statement: who acts on predictions and what action is taken.
- Label definition: how you define positive vs negative class (and why).
- Baseline: a simple logistic regression model with a small, clearly documented feature set.
- Final model: an improved pipeline (regularization + feature selection + proper evaluation); a sketch of both steps follows this list.
- Decision threshold: choose the threshold from the costs and benefits of each error type, not the default 0.5.
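A minimal sketch of the baseline and the regularized final model, using scikit-learn on synthetic data so it runs end to end; the `make_classification` dataset and the 90/10 class weights are placeholders for your own feature matrix and label.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV

# Placeholder data: imbalanced binary problem (roughly 10% positives).
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Baseline: plain logistic regression on scaled features.
baseline = Pipeline([("scale", StandardScaler()),
                     ("clf", LogisticRegression(max_iter=1000))])
baseline.fit(X_train, y_train)

# Final model: L1-penalized logistic regression with cross-validated strength;
# the L1 penalty drives uninformative coefficients to zero (implicit feature selection).
final = Pipeline([("scale", StandardScaler()),
                  ("clf", LogisticRegressionCV(penalty="l1", solver="liblinear",
                                               Cs=10, cv=5, max_iter=1000))])
final.fit(X_train, y_train)
```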
📏 Recommended evaluation
- Core metrics: ROC‑AUC, PR‑AUC, F1, precision/recall
- Decision metrics: confusion matrix at chosen threshold
- Calibration: a reliability check if the predicted probabilities are used as actual numbers (e.g., for pricing or expected-value decisions), not just to rank cases
For imbalanced problems, PR‑AUC usually reflects real‑world performance better than accuracy, because it focuses on the rare positive class rather than the dominant negatives.
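A sketch of this evaluation on the held-out test set, continuing the `final`, `X_test`, and `y_test` names from the pipeline sketch above; the 0.5 threshold here is only a placeholder until a cost-based one is chosen (see the Start here tip).

```python
from sklearn.metrics import (roc_auc_score, average_precision_score, f1_score,
                             precision_score, recall_score, confusion_matrix)
from sklearn.calibration import calibration_curve

proba = final.predict_proba(X_test)[:, 1]  # predicted P(positive class)

# Threshold-free ranking metrics.
print("ROC-AUC:", roc_auc_score(y_test, proba))
print("PR-AUC :", average_precision_score(y_test, proba))  # average precision summarizes the PR curve

# Decision metrics at a (placeholder) threshold.
threshold = 0.5
pred = (proba >= threshold).astype(int)
print("F1       :", f1_score(y_test, pred))
print("Precision:", precision_score(y_test, pred))
print("Recall   :", recall_score(y_test, pred))
print("Confusion matrix:\n", confusion_matrix(y_test, pred))

# Calibration: observed positive rate per predicted-probability bin.
frac_pos, mean_pred = calibration_curve(y_test, proba, n_bins=10)
for p, f in zip(mean_pred, frac_pos):
    print(f"predicted {p:.2f} -> observed {f:.2f}")
```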
đź§© Suggested project themes
- Churn: predict which customers will leave next month
- Fraud: detect suspicious transactions
- Default risk: approve/deny credit decisions
- Conversion: predict purchase from a campaign touch
🚀 Start here
Use Module 3 as your foundation (logistic regression + regularization). Build a clean baseline first, then iterate.
Tip: record the threshold you chose and why (relative cost of false positives vs false negatives). That record is what turns a scored model into a deployable decision rule; a worked sketch follows.
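One way to make that choice concrete, continuing the `proba` and `y_test` names from the evaluation sketch: assign a unit cost to each false positive and false negative (the figures below are made-up placeholders) and pick the threshold that minimizes total cost. For brevity this reuses the test split; in practice, tune the threshold on a separate validation split.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical unit costs: replace with numbers from your problem statement.
COST_FP = 5.0    # cost of acting on someone who was never going to be positive
COST_FN = 50.0   # cost of missing a true positive

def total_cost(y_true, proba, threshold):
    """Total misclassification cost at a given decision threshold."""
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, pred).ravel()
    return fp * COST_FP + fn * COST_FN

thresholds = np.linspace(0.01, 0.99, 99)
costs = [total_cost(y_test, proba, t) for t in thresholds]
best_t = thresholds[int(np.argmin(costs))]

# Log the choice and the reasoning alongside the model artifact.
print(f"Chosen threshold: {best_t:.2f} "
      f"(COST_FP={COST_FP}, COST_FN={COST_FN}, min total cost={min(costs):.1f})")
```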