Churn_prediction
ahmedhaji2504

Last Updated 2 days ago

About

This project presents an end-to-end analytical system, built and deployed on Zerve, that addresses a real-world business problem: predicting customer churn for a telecom company. Customer churn directly reduces revenue and growth, and identifying at-risk customers early enables proactive retention strategies.

The system ingests real customer data from the Telco Customer Churn dataset, performs robust data cleaning and feature engineering, trains multiple machine learning models, and deploys the best-performing model as a production-ready service.


Problem Statement


Telecom providers face high customer attrition due to competitive pricing and service alternatives. The goal of this project is to predict which customers are likely to churn, allowing business teams to target high-risk users with retention offers and reduce revenue loss.


Technical Approach


The pipeline is implemented as modular nodes in Zerve:


Data Ingestion & Cleaning:

Handle missing values, convert data types, and remove irrelevant identifiers to ensure high-quality input data.
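
As an illustration of this step, here is a minimal sketch of a cleaning function. It assumes the column names of the public Telco Customer Churn CSV (customerID, TotalCharges, Churn) and assumes that blank TotalCharges values can be filled with 0.0; the project's actual cleaning rules are not documented here and may differ.

```python
import pandas as pd


def clean_telco(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a raw Telco-style customer table."""
    out = df.copy()
    # Convert data types: blank strings become NaN under errors="coerce".
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Handle missing values. Assumption: a missing total means nothing billed yet.
    out["TotalCharges"] = out["TotalCharges"].fillna(0.0)
    # Encode the Yes/No target as 1/0.
    out["Churn"] = out["Churn"].map({"Yes": 1, "No": 0})
    # Remove the identifier, which carries no predictive signal.
    return out.drop(columns=["customerID"])


# Tiny hand-made sample, not real customer data.
raw = pd.DataFrame(
    {
        "customerID": ["0001-A", "0002-B", "0003-C"],
        "tenure": [0, 12, 40],
        "MonthlyCharges": [29.85, 56.95, 80.0],
        "TotalCharges": [" ", "683.4", "3200"],
        "Churn": ["No", "Yes", "No"],
    }
)
clean = clean_telco(raw)
```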


Feature Engineering:

Create derived features such as tenure_months, avg_charge_per_month, and tenure buckets to better capture customer behavior and spending patterns.
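
The derived features named above could be computed along these lines. This is a sketch under assumptions: the input columns tenure (in months) and TotalCharges, the bucket edges, and the bucket labels are all illustrative choices, not the project's confirmed definitions.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add tenure_months, avg_charge_per_month and tenure_bucket columns."""
    out = df.copy()
    # Assumption: the raw tenure column already counts months.
    out["tenure_months"] = out["tenure"].astype(int)
    # Divide by at least one month so brand-new customers do not divide by zero.
    months = out["tenure_months"].clip(lower=1)
    out["avg_charge_per_month"] = out["TotalCharges"] / months
    # Hypothetical bucket edges; intervals are right-inclusive, e.g. (12, 24].
    out["tenure_bucket"] = pd.cut(
        out["tenure_months"],
        bins=[-1, 12, 24, 48, float("inf")],
        labels=["0-12", "13-24", "25-48", "49+"],
    ).astype(str)
    return out


# Hand-made sample rows for illustration only.
sample = pd.DataFrame(
    {"tenure": [0, 12, 40, 60], "TotalCharges": [0.0, 683.4, 3200.0, 6000.0]}
)
features = add_features(sample)
```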


Preprocessing Pipeline:

Use a ColumnTransformer with scaling for numeric features and one-hot encoding for categorical variables to ensure consistent transformations.
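
A preprocessing step of this shape can be sketched with scikit-learn as follows. The column lists and sample rows are assumptions made for illustration; the real pipeline would list every numeric and categorical feature it uses.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column lists for the example.
numeric_cols = ["tenure_months", "MonthlyCharges"]
categorical_cols = ["Contract"]

preprocessor = ColumnTransformer(
    transformers=[
        # Standardize numeric features to zero mean and unit variance.
        ("num", StandardScaler(), numeric_cols),
        # One-hot encode categories; unseen categories at scoring time
        # become all-zero rows instead of raising an error.
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Hand-made sample rows for illustration only.
X = pd.DataFrame(
    {
        "tenure_months": [1, 12, 40, 60],
        "MonthlyCharges": [29.85, 56.95, 80.0, 100.0],
        "Contract": ["Month-to-month", "One year", "Two year", "Month-to-month"],
    }
)
Xt = preprocessor.fit_transform(X)
```

Fitting the transformer once and reusing it at scoring time is what keeps training and production transformations consistent.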


Model Training:

Train and compare multiple models including:


Logistic Regression (baseline, interpretable)


Random Forest


Gradient Boosting

Models are trained using an 80/20 stratified train-test split.
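
The training setup might look like the sketch below. It uses synthetic data from make_classification as a stand-in for the preprocessed Telco features, and the class balance, hyperparameters, and random seeds are arbitrary assumptions rather than the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with an imbalanced positive ("churn") class.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.73, 0.27], random_state=42
)

# 80/20 split; stratify=y keeps the churn rate similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

# fit() returns the estimator itself, so this maps name -> fitted model.
fitted = {name: model.fit(X_train, y_train) for name, model in models.items()}
```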


Validation & Model Selection


Model performance is evaluated using:


Accuracy, Precision, Recall, F1-score


ROC-AUC as the primary metric


Confusion matrices for error analysis


The best model is selected based on the highest ROC-AUC, with Recall as a tiebreaker to prioritize capturing potential churners.
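
One way to express this selection rule is a sort key of (ROC-AUC, Recall). In the sketch below, the function name select_best_model, the rounding of ROC-AUC to three decimals before comparing (so that near-equal scores count as a tie), and the metric values are all assumptions for illustration; the numbers are placeholders, not results from this project.

```python
def select_best_model(metrics, auc_decimals=3):
    """Pick the model with the highest ROC-AUC, using Recall to break ties.

    metrics maps model name -> {"roc_auc": float, "recall": float}.
    """

    def sort_key(name):
        m = metrics[name]
        # Tuples compare element by element: ROC-AUC first, then Recall.
        return (round(m["roc_auc"], auc_decimals), m["recall"])

    return max(metrics, key=sort_key)


# Placeholder metric values, chosen to show the tiebreaker in action.
candidate_metrics = {
    "logistic_regression": {"roc_auc": 0.8401, "recall": 0.55},
    "random_forest": {"roc_auc": 0.8200, "recall": 0.48},
    "gradient_boosting": {"roc_auc": 0.8399, "recall": 0.52},
}
best = select_best_model(candidate_metrics)
```

Here the two leading models tie on rounded ROC-AUC, so the one with the higher Recall is chosen.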


Business KPIs


To demonstrate real business value, the system translates predictions into KPIs such as:


Test set churn rate


Recall@Top 20%: the share of actual churners found among the 20% of customers with the highest predicted risk, simulating a targeted campaign


Retention impact simulation under an assumed intervention success rate


Projected revenue saved, based on estimated customer lifetime value


These KPIs translate model performance into churners reached and revenue retained, so business teams can judge the value of a retention campaign under the stated assumptions.
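
Two of these KPIs can be sketched in plain Python. The function names, the 25% intervention success rate, and the customer lifetime value of 1000 are hypothetical inputs chosen for the example, and the labels and scores are hand-made rather than taken from the model.

```python
def recall_at_top_k(y_true, y_score, k=0.2):
    """Share of all actual churners found in the top-k fraction by risk score."""
    n_top = max(1, int(round(k * len(y_true))))
    # Indices of customers sorted from highest to lowest predicted risk.
    ranked = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    captured = sum(y_true[i] for i in ranked[:n_top])
    total_churners = sum(y_true)
    return captured / total_churners if total_churners else 0.0


def projected_revenue_saved(churners_targeted, success_rate, lifetime_value):
    """Expected revenue retained if a fraction of targeted churners stay."""
    return churners_targeted * success_rate * lifetime_value


# Hand-made example: 10 customers, 3 of whom actually churn.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.1, 0.8, 0.3, 0.4, 0.05, 0.35, 0.15, 0.25]

recall_top20 = recall_at_top_k(y_true, y_score, k=0.2)
revenue = projected_revenue_saved(
    churners_targeted=2, success_rate=0.25, lifetime_value=1000.0
)
```

In this toy example the top 20% contains two of the three churners, so a campaign aimed at only two customers reaches two thirds of the churn.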


Deployment on Zerve


The final model and preprocessing pipeline are serialized and deployed on Zerve as:


A real-time API endpoint for scoring new customers, and/or


A scheduled batch workflow to score the entire customer base periodically.


The deployed service accepts raw customer records and returns churn probabilities and churn labels, making the solution production-ready and easy to integrate with business systems.
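
The serialize-then-score pattern described here can be sketched generically with joblib and a scikit-learn Pipeline. This does not use any Zerve-specific API; the score function, the file name, the 0.5 threshold, the feature columns, and the training rows are all assumptions for illustration.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hand-made training rows for illustration only.
train = pd.DataFrame(
    {
        "tenure": [1, 3, 5, 40, 60, 72],
        "MonthlyCharges": [95.0, 89.0, 80.0, 45.0, 30.0, 25.0],
        "Contract": ["Month-to-month"] * 3 + ["One year", "Two year", "Two year"],
        "Churn": [1, 1, 1, 0, 0, 0],
    }
)
feature_cols = ["tenure", "MonthlyCharges", "Contract"]

# Bundle preprocessing and model so raw records can be scored directly.
pipeline = Pipeline(
    [
        (
            "prep",
            ColumnTransformer(
                [
                    ("num", StandardScaler(), ["tenure", "MonthlyCharges"]),
                    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
                ]
            ),
        ),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
pipeline.fit(train[feature_cols], train["Churn"])

# Serialize the fitted pipeline to disk.
model_path = os.path.join(tempfile.mkdtemp(), "churn_pipeline.joblib")
joblib.dump(pipeline, model_path)


def score(records, path, threshold=0.5):
    """Load the pipeline and return a churn probability and label per record."""
    model = joblib.load(path)
    frame = pd.DataFrame(records)[feature_cols]
    # Column 1 of predict_proba is the probability of the positive class (1).
    probabilities = model.predict_proba(frame)[:, 1]
    return [
        {"churn_probability": float(p), "churn_label": int(p >= threshold)}
        for p in probabilities
    ]


result = score(
    [{"tenure": 2, "MonthlyCharges": 90.0, "Contract": "Month-to-month"}],
    model_path,
)
```

The same function could sit behind a real-time endpoint (one record per call) or a batch job (many records per call), since it accepts a list of raw records either way.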


Outcome


The result is a measurable, scalable churn prediction system that:


Uses real-world data,


Provides validated model performance,


Produces actionable business insights, and


Is fully deployed to production using Zerve.


This project demonstrates strong skills in machine learning, data engineering, model validation, and MLOps-style deployment using Zerve.
