
Vibe Coding a PLG Lead Scoring Pipeline
Last Updated 3 days ago
About
This project simulates realistic SaaS user and company data, including onboarding progress, feature adoption, session activity, and support interactions. It joins these behavioral signals with company attributes to create a single dataset for modeling. Using this synthetic data, the project builds a machine learning lead scoring pipeline with a Random Forest classifier to identify the users most likely to convert to paid. The model highlights the key behavioral and demographic signals that predict paid conversion. Visualizations of lead score distributions, conversion rates, and feature importance give sales, product, and marketing teams concrete signals to act on.
Overall, this end-to-end data science pipeline demonstrates synthetic data generation, feature engineering, supervised modeling, and interpretation of business-critical signals to optimize product-led growth strategies.
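As a rough illustration of the pipeline described above, the following minimal sketch simulates a small user table, trains a Random Forest, and reads off lead scores and feature importances. It is not the project's actual code: the feature names, the distributions, and the coefficients that decide who converts are all assumptions made up for this example, and it uses NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the real project may use different ones.
FEATURES = ["onboarding_pct", "features_adopted", "sessions_30d",
            "support_tickets", "company_size"]


def simulate_users(n_users=2000, seed=0):
    """Draw synthetic user rows plus a 0/1 paid-conversion label."""
    rng = np.random.default_rng(seed)
    onboarding = rng.uniform(0, 1, n_users)        # share of onboarding steps completed
    adopted = rng.integers(0, 11, n_users)         # number of features used, 0..10
    sessions = rng.poisson(8, n_users)             # sessions in the last 30 days
    tickets = rng.poisson(1, n_users)              # support tickets opened
    company_size = rng.lognormal(4, 1, n_users)    # employees at the user's company
    X = np.column_stack([onboarding, adopted, sessions, tickets, company_size])
    # Assumed ground truth: conversion odds rise with onboarding, adoption, and sessions.
    logit = -4.0 + 2.5 * onboarding + 0.25 * adopted + 0.08 * sessions
    converted = rng.random(n_users) < 1 / (1 + np.exp(-logit))
    return X, converted.astype(int)


def train_lead_scorer(X, y, seed=0):
    """Fit a Random Forest and score a held-out quarter of the users."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    model = RandomForestClassifier(
        n_estimators=200, min_samples_leaf=5, random_state=seed)
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]     # lead score = P(converts to paid)
    return model, scores, roc_auc_score(y_test, scores)


X, y = simulate_users()
model, lead_scores, auc = train_lead_scorer(X, y)
importance = dict(zip(FEATURES, model.feature_importances_))

print(f"holdout ROC AUC: {auc:.3f}")
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {value:.3f}")
```

Here the lead score is simply the predicted probability of the positive class. Because the labels come from a known formula, the importance ranking can be checked against the signals that were deliberately built into the data, which is one practical advantage of working with synthetic data.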