
Churn_prediction
Last updated 2 days ago
About
This project presents a full end-to-end analytical system built and deployed on Zerve to solve a real-world business problem: predicting customer churn for a telecom company. Customer churn directly impacts revenue and growth, and early identification of at-risk customers enables proactive retention strategies.
The system ingests real customer data from the Telco Customer Churn dataset, performs robust data cleaning and feature engineering, trains multiple machine learning models, and deploys the best-performing model as a production-ready service.
Problem Statement
Telecom providers face high customer attrition due to competitive pricing and service alternatives. The goal of this project is to predict which customers are likely to churn, allowing business teams to target high-risk users with retention offers and reduce revenue loss.
Technical Approach
The pipeline is implemented as modular nodes in Zerve:
Data Ingestion & Cleaning:
Handle missing values, convert data types, and remove irrelevant identifiers to ensure high-quality input data.
Feature Engineering:
Create derived features such as tenure_months, avg_charge_per_month, and tenure buckets to better capture customer behavior and spending patterns.
Preprocessing Pipeline:
Use a ColumnTransformer with scaling for numeric features and one-hot encoding for categorical variables, so that the same transformations are applied at training time and at scoring time.
Model Training:
Train and compare multiple models including:
Logistic Regression (baseline, interpretable)
Random Forest
Gradient Boosting
Models are trained using an 80/20 stratified train-test split.
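As a rough illustration of the steps above, the following sketch runs cleaning, feature engineering, preprocessing, and model comparison on a small synthetic table. It is not the project's actual node code: the toy data generator, the choice to fill blank TotalCharges with 0, the definition of tenure_months as a renamed tenure column, the bucket boundaries, and all hyperparameters are assumptions made for the example. Column names follow the public Telco Customer Churn dataset.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_toy_telco(n=400, seed=0):
    """Synthetic stand-in with Telco-style columns (NOT the real dataset)."""
    rng = np.random.default_rng(seed)
    tenure = rng.integers(0, 73, size=n)
    monthly = rng.uniform(20, 120, size=n).round(2)
    contract = rng.choice(["Month-to-month", "One year", "Two year"], size=n)
    # Make churn likelier for short-tenure, month-to-month customers.
    p_churn = 0.1 + 0.4 * (contract == "Month-to-month") + 0.3 * (tenure < 12)
    total = (tenure * monthly).round(2).astype(str)
    total[tenure == 0] = " "  # mimic blank strings in the raw TotalCharges column
    return pd.DataFrame({
        "customerID": [f"C{i:04d}" for i in range(n)],
        "tenure": tenure,
        "MonthlyCharges": monthly,
        "TotalCharges": total,
        "Contract": contract,
        "Churn": np.where(rng.random(n) < p_churn, "Yes", "No"),
    })


def clean_and_engineer(raw):
    df = raw.drop(columns=["customerID"])  # identifier carries no signal
    # Blank strings become NaN, then 0.0 (assumption: brand-new customers).
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0.0)
    df["Churn"] = (df["Churn"] == "Yes").astype(int)
    df = df.rename(columns={"tenure": "tenure_months"})
    # Clip tenure at 1 to avoid dividing by zero for first-month customers.
    df["avg_charge_per_month"] = df["TotalCharges"] / df["tenure_months"].clip(lower=1)
    df["tenure_bucket"] = pd.cut(
        df["tenure_months"],
        bins=[-1, 12, 24, 48, np.inf],
        labels=["0-12", "13-24", "25-48", "49+"],
    ).astype(str)
    return df


df = clean_and_engineer(make_toy_telco())
X, y = df.drop(columns=["Churn"]), df["Churn"]

numeric = ["tenure_months", "MonthlyCharges", "TotalCharges", "avg_charge_per_month"]
categorical = ["Contract", "tenure_bucket"]
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# 80/20 split, stratified on the target so both sets keep the same churn rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

auc_by_model = {}
for name, model in models.items():
    # Each model gets its own copy of the preprocessor inside one pipeline.
    pipe = Pipeline([("pre", clone(preprocessor)), ("model", model)])
    pipe.fit(X_train, y_train)
    auc_by_model[name] = roc_auc_score(y_test, pipe.predict_proba(X_test)[:, 1])

for name, auc in auc_by_model.items():
    print(f"{name}: ROC-AUC = {auc:.3f}")
```

Wrapping the preprocessor and the estimator in a single Pipeline means scaling statistics and category lists are learned from the training split only, which avoids leaking test-set information into the comparison.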
Validation & Model Selection
Model performance is evaluated using:
Accuracy, Precision, Recall, F1-score
ROC-AUC as the primary metric
Confusion matrices for error analysis
The best model is selected based on the highest ROC-AUC, with Recall as a tiebreaker to prioritize capturing potential churners.
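One way to express this selection rule is a sort key that compares ROC-AUC first and Recall second. The sketch below is illustrative only: the function name, the rounding of ROC-AUC to three decimals (so that near-identical scores count as a tie), and the metric values are all hypothetical and are not results from this project.

```python
def select_best_model(metrics, auc_decimals=3):
    """Pick the model with the highest ROC-AUC; break ties on Recall.

    `metrics` maps model name -> {"roc_auc": float, "recall": float}.
    Rounding ROC-AUC before comparing is an assumption of this sketch.
    """
    return max(
        metrics,
        key=lambda name: (
            round(metrics[name]["roc_auc"], auc_decimals),
            metrics[name]["recall"],
        ),
    )


# Made-up numbers, chosen so that two models tie on rounded ROC-AUC.
example_metrics = {
    "logistic_regression": {"roc_auc": 0.8420, "recall": 0.55},
    "random_forest": {"roc_auc": 0.8304, "recall": 0.50},
    "gradient_boosting": {"roc_auc": 0.8421, "recall": 0.58},
}

best = select_best_model(example_metrics)
print(best)  # gradient_boosting wins the tie on Recall
```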
Business KPIs
To demonstrate real business value, the system translates predictions into KPIs such as:
Test set churn rate
Recall@Top 20%: the share of actual churners found among the 20% of customers with the highest predicted risk, simulating a targeted campaign
Retention impact simulation under an assumed intervention success rate
Projected revenue saved, based on estimated customer lifetime value
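These KPIs estimate how much churn and revenue loss a model-driven retention campaign could avoid. The projected figures depend on the assumed intervention success rate and customer lifetime value.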
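The arithmetic behind these KPIs can be sketched in a few lines of plain Python. The helper names, the 30% intervention success rate, and the lifetime value of 2,000 are assumptions chosen for the example, as are the ten labels and scores; none of them come from the project's results.

```python
def churners_in_top_fraction(y_true, scores, fraction=0.2):
    """Count actual churners among the top `fraction` of customers by score."""
    n_top = max(1, int(round(len(scores) * fraction)))
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_top])


def projected_revenue_saved(churners_targeted, success_rate, lifetime_value):
    """Expected revenue retained if a share of targeted churners is saved."""
    return churners_targeted * success_rate * lifetime_value


# Toy test set: 1 = churned, 0 = stayed, with hypothetical model scores.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.40, 0.30, 0.60, 0.20, 0.10, 0.05, 0.50]

churn_rate = sum(y_true) / len(y_true)
captured = churners_in_top_fraction(y_true, scores, fraction=0.2)
recall_at_top_20 = captured / sum(y_true)
revenue_saved = projected_revenue_saved(captured, success_rate=0.3, lifetime_value=2000.0)

print(f"Test set churn rate: {churn_rate:.0%}")
print(f"Recall@Top 20%: {recall_at_top_20:.2f}")
print(f"Projected revenue saved: {revenue_saved:.2f}")
```

In this toy case the top 20% is two customers, one of whom churned, so the campaign reaches one of three churners.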
Deployment on Zerve
The final model and preprocessing pipeline are serialized and deployed on Zerve as:
A real-time API endpoint for scoring new customers, and/or
A scheduled batch workflow to score the entire customer base periodically.
The deployed service accepts raw customer records and returns churn probabilities and churn labels, making the solution production-ready and easy to integrate with business systems.
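The scoring contract described here (raw records in, probabilities and labels out) might look roughly like the sketch below. It deliberately avoids any Zerve-specific API, since the deployment calls are not shown in this description. The tiny training table, the use of pickle for serialization, the score_customers function, the output field names, and the 0.5 decision threshold are all assumptions made for illustration.

```python
import pickle

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

FEATURES = ["tenure", "MonthlyCharges", "Contract"]

# Tiny hand-made training table, only so the example can run end to end.
train = pd.DataFrame({
    "tenure": [1, 2, 3, 5, 40, 50, 60, 70],
    "MonthlyCharges": [90.0, 85.5, 99.0, 80.0, 30.0, 45.0, 25.0, 55.0],
    "Contract": ["Month-to-month"] * 4 + ["Two year", "One year", "Two year", "One year"],
    "Churn": [1, 1, 1, 1, 0, 0, 0, 0],
})

pipeline = Pipeline([
    ("pre", ColumnTransformer([
        ("num", StandardScaler(), ["tenure", "MonthlyCharges"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
    ])),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(train[FEATURES], train["Churn"])

# Serialize preprocessing and model together so they cannot drift apart.
artifact = pickle.dumps(pipeline)


def score_customers(records, artifact_bytes, threshold=0.5):
    """Score raw customer dicts; return a probability and a label for each."""
    model = pickle.loads(artifact_bytes)
    frame = pd.DataFrame(records)[FEATURES]
    probabilities = model.predict_proba(frame)[:, 1]
    return [
        {"churn_probability": float(p), "churn_label": int(p >= threshold)}
        for p in probabilities
    ]


new_customers = [
    {"tenure": 2, "MonthlyCharges": 95.0, "Contract": "Month-to-month"},
    {"tenure": 65, "MonthlyCharges": 30.0, "Contract": "Two year"},
]
results = score_customers(new_customers, artifact)
for customer, result in zip(new_customers, results):
    print(customer["Contract"], result)
```

The same function could back either deployment mode: called once per request behind an API endpoint, or applied to the full customer table in a scheduled batch run. Pickle files should only be loaded from trusted sources.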
Outcome
The result is a measurable, scalable churn prediction system that:
Uses real-world data,
Provides validated model performance,
Produces actionable business insights, and
Is fully deployed to production using Zerve.
This project demonstrates strong skills in machine learning, data engineering, model validation, and MLOps-style deployment using Zerve.