🚀 Lab Mission: Build Your $52M Decision Tree

Code a production-ready loan approval system in 90 minutes

Complete all tasks to save Regional Bank!
📊 Task 1: Load and Explore Loan Data (10 points)

Begin by loading the bank's historical loan data and understanding its structure. This dataset contains 50,000 loan applications from the past 5 years.

💡 Hint: Key features to consider:
  • credit_score (300-850 range)
  • annual_income (in USD)
  • debt_to_income (percentage)
  • employment_years
  • loan_amount
  • previous_defaults (count)
Remember to handle missing values and encode categorical variables!
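
One possible starting point, sketched on a tiny synthetic table because the bank's actual data file is not shown here. The columns `employment_type` and `defaulted` are hypothetical names added for illustration; the remaining columns follow the hint above. Median imputation and one-hot encoding are just one reasonable choice, not the required approach.

```python
import numpy as np
import pandas as pd

# Tiny synthetic stand-in for the bank's data (the real file is not given here).
# 'employment_type' and 'defaulted' are hypothetical column names.
loans = pd.DataFrame({
    "credit_score": [720, 580, np.nan, 810, 640],
    "annual_income": [85000, 32000, 54000, np.nan, 47000],
    "debt_to_income": [18.0, 45.0, 30.0, 12.0, 38.0],
    "employment_years": [6, 1, 3, 15, 2],
    "loan_amount": [20000, 15000, 10000, 35000, 12000],
    "previous_defaults": [0, 2, 0, 0, 1],
    "employment_type": ["salaried", "self_employed", "salaried", None, "contract"],
    "defaulted": [0, 1, 0, 0, 1],
})

# Explore: shape, missing values per column, and class balance
print(loans.shape)
missing_counts = loans.isna().sum()
default_rate = loans["defaulted"].mean()
print(missing_counts)
print(f"default rate: {default_rate:.2f}")

# Handle missing values: median for numeric columns, a placeholder for the categorical one
numeric_cols = loans.select_dtypes(include="number").columns
loans[numeric_cols] = loans[numeric_cols].fillna(loans[numeric_cols].median())
loans["employment_type"] = loans["employment_type"].fillna("unknown")

# Encode the categorical column as one-hot indicator columns
loans_encoded = pd.get_dummies(loans, columns=["employment_type"])
print(loans_encoded.columns.tolist())
```

On the real 50,000-row dataset, the same steps apply after reading the file into a DataFrame.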
🔧 Task 2: Prepare Data for Training (15 points)

Split the data strategically and handle class imbalance. Remember: defaults are rare but costly!

💡 Hint: For cost-sensitive learning:
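# Cost ratio: an approved loan that defaults ($25,000) vs. a wrongly rejected good loan ($3,500)
cost_ratio = 25000 / 3500  # ≈ 7.14
# A false negative is about 7x more expensive than a false positive!

# Use class_weight='balanced' or custom weights (class 1 = default)
class_weight = {0: 1, 1: cost_ratio}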
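
A minimal sketch of a stratified split combined with the cost-based weights from the hint above. The data here is synthetic, and the 10% default rate, the 80/20 split, and the random seed are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in: 1,000 applications, 6 features, roughly 10% defaults (assumed rate)
X = rng.normal(size=(1000, 6))
y = (rng.random(1000) < 0.10).astype(int)

# stratify=y keeps the default rate nearly identical in the train and test splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Custom weights from the cost ratio: class 1 (default) counts about 7x as much
cost_ratio = 25000 / 3500
class_weight = {0: 1, 1: cost_ratio}

print(f"train default rate: {y_train.mean():.3f}")
print(f"test default rate:  {y_test.mean():.3f}")
```

Without `stratify`, a rare class can end up noticeably over- or under-represented in the test split, which distorts every downstream metric.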
🌳 Task 3: Build Your First Decision Tree (20 points)

Create a decision tree that balances accuracy with interpretability. Remember: regulators require explanations!

💡 Hint: Key parameters for business constraints:
  • max_depth=5 - Deep enough to capture patterns but still interpretable
  • min_samples_split=20 - Regulatory requirement for statistical significance
  • class_weight='balanced' - Automatically adjusts for imbalanced classes
  • criterion='gini' - Slightly cheaper to compute than entropy (no logarithms), which helps on large datasets
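
The parameters above could be put together roughly as follows. This is a sketch on synthetic data from `make_classification` (an assumption, since the lab's dataset is not shown), and it also computes the four checkpoint metrics.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the loan data (about 10% positives = defaults)
X, y = make_classification(
    n_samples=2000, n_features=6, n_informative=4, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Parameters taken from the hint; random_state is added only for reproducibility
model = DecisionTreeClassifier(
    max_depth=5,
    min_samples_split=20,
    class_weight="balanced",
    criterion="gini",
    random_state=0,
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"depth={model.get_depth()}, leaves={model.get_n_leaves()}")
```

With `class_weight='balanced'`, expect recall on defaults to rise at the expense of precision compared with an unweighted tree.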
🏁 Checkpoint 1: Basic Model Complete

Great work! You've built a basic decision tree. Current performance:

Accuracy: --
Precision: --
Recall: --
F1 Score: --
💰 Task 4: Calculate Financial Impact (25 points)

The moment of truth! Calculate the actual dollar impact of your model's decisions.

💡 Hint: Financial calculation breakdown:
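# tp, fp, tn, fn are confusion-matrix counts with "default" as the positive class,
# so a positive prediction means the loan is rejected.
# Each correct rejection saves a potential $25,000 loss
saved_from_defaults = tp * 25000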

# Each incorrect rejection loses $3,500 in profit
lost_opportunities = fp * 3500  

# Each correct approval earns $3,500
earned_from_loans = tn * 3500

# Each incorrect approval costs $25,000
losses_from_defaults = fn * 25000

# Net impact (want this positive and large!)
total_impact = (saved_from_defaults + earned_from_loans - 
                lost_opportunities - losses_from_defaults)
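
The breakdown above can be wrapped into a small reusable function. The sketch below uses plain Python; the helper names `confusion_counts` and `financial_impact` are made up for this example, and the six labels at the bottom are toy values, not lab data.

```python
def confusion_counts(y_true, y_pred):
    """Count tp, fp, tn, fn with 1 = default (loan rejected when predicted)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def financial_impact(tp, fp, tn, fn, loss_per_default=25000, profit_per_loan=3500):
    """Net dollar impact using the dollar figures from the hint."""
    saved_from_defaults = tp * loss_per_default
    lost_opportunities = fp * profit_per_loan
    earned_from_loans = tn * profit_per_loan
    losses_from_defaults = fn * loss_per_default
    return (saved_from_defaults + earned_from_loans
            - lost_opportunities - losses_from_defaults)


# Toy example: 3 actual defaults, 3 good loans
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
total_impact = financial_impact(tp, fp, tn, fn)
print(tp, fp, tn, fn)   # 2 1 2 1
print(total_impact)     # 28500
```

Here the two caught defaults contribute $50,000 and the two good approvals $7,000, while the one wrongly rejected loan costs $3,500 and the one missed default costs $25,000.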
📈 Task 5: Optimize for Maximum Value (20 points)

Find the optimal tree depth that maximizes financial impact while maintaining interpretability.

💡 Hint: Look for the sweet spot where:
  • Financial impact plateaus (more depth doesn't help)
  • Tree remains interpretable (< 100 leaves)
  • Validation performance doesn't drop (no overfitting)
The optimal depth is typically between 4 and 6 for loan data.
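
One way to search for that sweet spot is a simple sweep over `max_depth`, scoring each tree by financial impact on a held-out validation set. The sketch below again relies on synthetic data, and the depth range 2 to 10 is an arbitrary assumption; the dollar figures and the 100-leaf limit come from the hints.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def financial_impact(y_true, y_pred):
    # confusion_matrix(...).ravel() returns tn, fp, fn, tp for labels [0, 1]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return int(tp * 25000 + tn * 3500 - fp * 3500 - fn * 25000)


X, y = make_classification(
    n_samples=3000, n_features=6, n_informative=4, n_redundant=0,
    weights=[0.9, 0.1], random_state=1,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

results = []  # (depth, number of leaves, validation impact in dollars)
for depth in range(2, 11):
    tree = DecisionTreeClassifier(
        max_depth=depth, min_samples_split=20,
        class_weight="balanced", random_state=1,
    ).fit(X_train, y_train)
    impact = financial_impact(y_val, tree.predict(X_val))
    results.append((depth, tree.get_n_leaves(), impact))
    print(f"depth={depth:2d} leaves={tree.get_n_leaves():3d} impact=${impact:,}")

# Keep only interpretable trees (< 100 leaves), then take the best validation impact
interpretable = [r for r in results if r[1] < 100]
best_depth, best_leaves, best_impact = max(interpretable, key=lambda r: r[2])
print(f"best depth: {best_depth} ({best_leaves} leaves, ${best_impact:,})")
```

Scoring on validation data rather than training data is what exposes overfitting: training impact keeps rising with depth, while validation impact flattens or falls.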
📝 Task 6: Create Decision Explanations (15 points)

Regulatory compliance requires explaining every loan decision. Build a system to extract human-readable rules.

💡 Hint: To traverse the decision path:
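# Get indices of nodes in path
# (decision_path is assumed to hold the sparse matrix returned by
#  model.decision_path(sample), where sample is a single-row 2D array)
node_indicator = decision_path.toarray()[0]

# For each node in the path
for node_id in range(len(node_indicator)):
    if node_indicator[node_id] == 1:
        # This node is in the path
        if model.tree_.children_left[node_id] != -1:
            # Not a leaf - add the rule
            feature_id = model.tree_.feature[node_id]
            threshold_value = model.tree_.threshold[node_id]
            # Determine direction based on sample value:
            # scikit-learn sends a sample left when value <= threshold
            went_left = sample[0, feature_id] <= threshold_value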
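
The traversal above could be completed into a function that returns readable rules. This is a sketch under assumptions: `explain_decision` is a made-up helper name, the tree is trained on synthetic data, and the feature names from Task 1 are attached to synthetic columns purely for display.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

feature_names = [
    "credit_score", "annual_income", "debt_to_income",
    "employment_years", "loan_amount", "previous_defaults",
]

# Synthetic data; the names above are labels for illustration only
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0, random_state=0
)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)


def explain_decision(model, sample, feature_names):
    """Return (prediction, rules) for one sample, one rule per split on its path."""
    # scikit-learn trees compare features as float32, so cast to match
    sample = np.asarray(sample, dtype=np.float32).reshape(1, -1)
    node_indicator = model.decision_path(sample).toarray()[0]

    rules = []
    for node_id in np.flatnonzero(node_indicator):
        if model.tree_.children_left[node_id] == -1:
            continue  # leaf node: no split rule to report
        feature_id = model.tree_.feature[node_id]
        threshold_value = model.tree_.threshold[node_id]
        value = sample[0, feature_id]
        # Samples go left when value <= threshold, right otherwise
        op = "<=" if value <= threshold_value else ">"
        rules.append(
            f"{feature_names[feature_id]} = {value:.2f} {op} {threshold_value:.2f}"
        )

    prediction = int(model.predict(sample)[0])
    return prediction, rules


prediction, rules = explain_decision(model, X[0], feature_names)
print("prediction:", prediction)
for rule in rules:
    print("  ", rule)
```

Because each rule cites the applicant's own value next to the threshold, the output can be logged or shown to a reviewer as the reason for the decision.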
🚀 Task 7: Deploy to Production (15 points)

Final step! Package your model for production deployment with monitoring and safeguards.

💡 Hint: Production considerations:
  • Add input validation for all features
  • Log all decisions for audit trail
  • Monitor drift in feature distributions
  • Set up alerts for unusual patterns
  • Include model versioning
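
Three of these considerations (input validation, audit logging, versioning) fit into a thin wrapper around the model. The sketch below uses only the standard library. Everything named here is hypothetical: `MODEL_VERSION`, `FEATURE_BOUNDS`, `validate_application`, `decide`, and the stand-in predictor. Only the credit score range comes from the lab text; the other bounds are illustrative assumptions. Drift monitoring and alerting are not covered.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("loan_decisions")

MODEL_VERSION = "1.0.0"  # hypothetical version tag

# (min, max) per feature; only credit_score's range is given in the lab,
# the remaining bounds are assumptions for illustration
FEATURE_BOUNDS = {
    "credit_score": (300, 850),
    "annual_income": (0, float("inf")),
    "debt_to_income": (0, 100),
    "employment_years": (0, 80),
    "loan_amount": (0, float("inf")),
    "previous_defaults": (0, float("inf")),
}


def validate_application(application):
    """Return a list of validation errors (empty if the input is acceptable)."""
    errors = []
    for name, (low, high) in FEATURE_BOUNDS.items():
        value = application.get(name)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name}: missing or non-numeric")
        elif not low <= value <= high:  # also rejects NaN
            errors.append(f"{name}: {value} outside [{low}, {high}]")
    return errors


def decide(application, predict_default):
    """Validate, predict, and log one decision. predict_default is any callable."""
    errors = validate_application(application)
    if errors:
        raise ValueError("; ".join(errors))
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": MODEL_VERSION,
        "inputs": application,
        "decision": "reject" if predict_default(application) else "approve",
    }
    logger.info(json.dumps(record))  # audit trail entry
    return record


# Stand-in predictor for demonstration; a real system would call the trained tree
def stub_predictor(app):
    return app["previous_defaults"] > 0


good_application = {
    "credit_score": 720, "annual_income": 85000, "debt_to_income": 18.0,
    "employment_years": 6, "loan_amount": 20000, "previous_defaults": 0,
}
result = decide(good_application, stub_predictor)
print(result["decision"])  # approve
```

Rejecting invalid input with an exception, instead of silently predicting, keeps out-of-range values from ever reaching the model or the audit log as a decision.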
🏆 Final Performance Assessment
Model Accuracy: 87%
Annual Savings: $52M
Approval Rate: 65%
Processing Time: 47ms
Explainability: 100%
Lab Score: --/100

🎉 Mission Accomplished!

Outstanding work! You've built a production-ready decision tree system that will save Regional Bank $52 million annually.
