data-science-job-salaries Canvas
About
This canvas is a salary benchmarking and fair-offer intelligence tool. It builds a Ridge regression model on 2020–2022 Kaggle salary data to predict salaries for data science roles and to classify each salary as Below, Fair, or Above Market. The workflow covers data cleaning (565 rows, 42 duplicates removed), exploratory analysis across 8 dimensions (experience, location, title, remote, etc.), feature engineering (19 features, log-transformation of the target), train/test splitting with temporal sensitivity checks, Ridge model training (R²=0.545, MAE=$34K), comparison against Random Forests, and benchmark label generation using a ±38% model-relative band. Three visualizations (salary distribution by experience/label, feature importance, and a predicted vs. actual scatter plot) accompany the analysis, and everything flows into a final markdown report with business recommendations and model limitations.
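
The labeling step can be illustrated with a minimal sketch. It assumes the ±38% band is applied symmetrically around the model's predicted salary in dollar terms (that is, after any log-scale prediction has been converted back to dollars); the canvas may define the band differently. The function name `benchmark_label`, its parameters, and the sample offers are hypothetical and are not taken from the canvas itself.

```python
def benchmark_label(actual_salary: float, predicted_salary: float, band: float = 0.38) -> str:
    """Classify a salary relative to a model prediction.

    Assumption: the band is symmetric around the prediction in dollar space,
    so the "Fair" range is [predicted * (1 - band), predicted * (1 + band)].
    """
    if predicted_salary <= 0:
        raise ValueError("predicted_salary must be positive")

    lower = predicted_salary * (1 - band)
    upper = predicted_salary * (1 + band)

    if actual_salary < lower:
        return "Below Market"
    if actual_salary > upper:
        return "Above Market"
    return "Fair"


# Hypothetical (actual, predicted) salary pairs in USD, for illustration only.
offers = [(60_000, 100_000), (100_000, 100_000), (150_000, 100_000)]
labels = [benchmark_label(actual, predicted) for actual, predicted in offers]
print(labels)
```

With a band this wide, only salaries roughly 38% away from the prediction are flagged. That fits a model with a mean absolute error of about $34K, since a narrower band would label many salaries as off-market purely because of model error.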



