User Engagement & 30-Day Retention Prediction
About
A comprehensive data science workflow that analyzes 4,774 users to predict 30-day retention from behavioral features collected during each user's first 7 days of activity. The canvas implements a complete machine learning pipeline: data loading → cleaning → feature engineering → retention label creation → exploratory data analysis → model training (a random forest (RF) for 30-day retention, plus time-windowed Day 1 and Day 3 models) → persona clustering → visualizations (ROC curves, feature importance, retention curves, persona analysis) → production scoring system.

The 30-day retention model reaches 83.5% ROC-AUC. Its key drivers are time span (46.5%), total events (15.3%), and active days (13.9%). The persona clustering identifies behavioral personas ranging from low-engagement explorers (1% retention) to mid-engaged explorers (17% retention), and the analysis closes with actionable product recommendations for improving multi-day engagement loops.
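
As a rough illustration of the feature engineering and label creation steps, the sketch below derives the three named drivers (time span, total events, active days) from a raw event log. It is not the canvas's actual code. The `build_features` function, the `(user_id, timestamp)` event shape, the unit of hours for time span, and the label definition (any activity 30 or more days after a user's first event) are all assumptions made for this example, and the events are hand-made toy values rather than project data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FEATURE_WINDOW = timedelta(days=7)   # features use only the first 7 days
RETENTION_DAY = timedelta(days=30)   # assumed cutoff for the retention label


def build_features(events):
    """Build per-user features from (user_id, datetime) pairs.

    Hypothetical helper: returns {user_id: feature dict}.
    """
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    rows = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        first = stamps[0]
        # Keep only events inside the 7-day window so features cannot
        # leak information from the period the label is computed on.
        window = [t for t in stamps if t - first < FEATURE_WINDOW]
        rows[user_id] = {
            "total_events": len(window),
            "active_days": len({t.date() for t in window}),
            "time_span_hours": (window[-1] - window[0]).total_seconds() / 3600,
            # Assumed label: any activity 30+ days after the first event.
            "retained_30d": int(any(t - first >= RETENTION_DAY for t in stamps)),
        }
    return rows


# Toy, hand-made events for illustration only (not the project's data).
toy_events = [
    ("u1", datetime(2024, 1, 1, 9, 0)),
    ("u1", datetime(2024, 1, 3, 9, 0)),
    ("u1", datetime(2024, 2, 5, 9, 0)),
    ("u2", datetime(2024, 1, 1, 12, 0)),
    ("u2", datetime(2024, 1, 1, 13, 0)),
]
features = build_features(toy_events)
```

Restricting the features to the 7-day window while computing the label from later activity keeps the two from overlapping. The resulting rows could then be passed to a classifier such as a random forest.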


