
Zerve_winner

guptavaibhab28
February 28, 2026

About

Zerve Hackathon Submission Summary

Project: Autonomous User Conversion Engine

Author: Vaibhav Gupta


1. The Question (Problem Statement)

In a freemium SaaS model, only a tiny fraction of users convert to a paid tier. The core question this project answers is: "Which specific behavioral patterns and product interactions reliably predict whether a free user will upgrade to a paid account?" By identifying these high-intent users early, the business can autonomously target them with premium trial offers, optimizing marketing spend and increasing revenue.


2. Success Definition

Success for this project was defined not by naive accuracy (which is misleading in highly imbalanced datasets), but by the ability to effectively separate noise from signal. Technical success was measured using the ROC-AUC score, aiming for a high discriminative capability. Business success was defined by deploying the model as an actionable inference function that outputs specific sales actions ("High Priority", "Warm Lead", "Low Priority") based on real-time event data.
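To illustrate why naive accuracy misleads at this level of imbalance, the sketch below scores a hypothetical model that never predicts an upgrade, using the class counts reported in the Methodology section (38 upgraded users out of 5,410). The roc_auc helper is an illustrative rank-based implementation written for this example, not the project's own evaluation code.

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC: the chance that a random positive outscores
    a random negative, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Class counts taken from the write-up: 38 upgraded users out of 5,410.
labels = [1] * 38 + [0] * 5372

# Hypothetical baseline that gives every user the same score (never upgrades).
always_negative = [0.0] * len(labels)

# Accuracy of predicting "no upgrade" for everyone.
accuracy = sum(1 for y in labels if y == 0) / len(labels)
baseline_auc = roc_auc(labels, always_negative)

print(f"accuracy={accuracy:.4f}  roc_auc={baseline_auc:.2f}")
```

The baseline is right about 99.3% of users yet ranks no upgraded user above any other, which is why ROC-AUC reports it as no better than chance.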


3. Methodology

All development, data processing, and modeling were conducted natively within the Zerve Canvas to ensure full reproducibility and eliminate context-switching.


Data Ingestion & Cleaning: Analyzed ~400k raw event log rows. The column used as the target, prop_credits_used, was sparsely populated (94.7% missing); a populated value acts as an implicit label for upgraded users.
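A minimal sketch of this step, assuming each event is a dictionary and that users are identified by a user_id field (a hypothetical column name; the write-up does not give the schema). It measures how often prop_credits_used is missing and treats any user with at least one populated value as upgraded, which is one plausible reading of the "implicit label" described above.

```python
def missing_rate(events, column):
    """Fraction of events where `column` is absent or None."""
    missing = sum(1 for event in events if event.get(column) is None)
    return missing / len(events)


def upgraded_users(events, id_column="user_id", label_column="prop_credits_used"):
    """Users with at least one event where the label column is populated."""
    return {e[id_column] for e in events if e.get(label_column) is not None}


# Tiny made-up sample; the real data has ~400k rows.
events = [
    {"user_id": "u1", "prop_credits_used": None},
    {"user_id": "u1", "prop_credits_used": 3},
    {"user_id": "u2", "prop_credits_used": None},
    {"user_id": "u3", "prop_credits_used": None},
]

rate = missing_rate(events, "prop_credits_used")
upgraded = upgraded_users(events)
print(f"missing={rate:.1%}  upgraded={sorted(upgraded)}")
```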


Feature Engineering: Aggregated the row-level event logs into 5,410 distinct user profiles, generating 84 behavioral features (e.g., unique sessions, total viewport widths, feature engagement counts) while actively dropping leaky variables to ensure model integrity.
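The aggregation can be sketched as follows. Feature names follow the column_aggregation pattern visible in the write-up (timestamp_nunique, ..._sum). The user_id, session_id, and viewport_width fields and the contents of LEAKY_COLUMNS are assumptions made for illustration; the actual 84 features and the dropped columns are not listed in the summary.

```python
from collections import defaultdict

# Assumed leak: the label column itself must never become a feature.
LEAKY_COLUMNS = {"prop_credits_used"}


def build_profiles(events, id_column="user_id"):
    """Collapse row-level events into one feature dict per user."""
    distinct = defaultdict(lambda: defaultdict(set))
    totals = defaultdict(lambda: defaultdict(float))
    for event in events:
        user = event[id_column]
        user_distinct = distinct[user]  # registers the user even if all values are None
        for column, value in event.items():
            if column == id_column or column in LEAKY_COLUMNS or value is None:
                continue
            user_distinct[column].add(value)
            # Sum only genuinely numeric values (bool is excluded on purpose).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                totals[user][column] += value
    profiles = {}
    for user, columns in distinct.items():
        row = {f"{col}_nunique": len(vals) for col, vals in columns.items()}
        row.update({f"{col}_sum": total for col, total in totals[user].items()})
        profiles[user] = row
    return profiles


# Made-up events for two users.
events = [
    {"user_id": "u1", "timestamp": "2026-01-01T10:00:00", "session_id": "s1",
     "viewport_width": 1280, "prop_credits_used": None},
    {"user_id": "u1", "timestamp": "2026-01-01T10:05:00", "session_id": "s1",
     "viewport_width": 1280, "prop_credits_used": 2},
    {"user_id": "u2", "timestamp": "2026-01-02T09:00:00", "session_id": "s9",
     "viewport_width": 800, "prop_credits_used": None},
]

profiles = build_profiles(events)
print(profiles["u1"])
```

Excluding the label column before aggregating keeps the features from encoding the outcome they are meant to predict.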


Modeling: Identified a severe class imbalance (only 38 of 5,410 users upgraded, a 0.70% conversion rate). Used a Gradient Boosting Classifier with a calculated positive-class weight (scale_pos_weight = 5,372 / 38 ≈ 141.37, the ratio of non-upgraded to upgraded users) so that misclassifying an upgraded user is penalized more heavily than misclassifying a non-upgraded one.
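The reported weight can be reproduced from the class counts alone. The write-up does not name the library; scale_pos_weight is the parameter name used by XGBoost and LightGBM, where the usual heuristic is the ratio of negative to positive examples, so the sketch below assumes that convention.

```python
# Counts taken from the write-up.
n_users = 5410
n_upgraded = 38
n_not_upgraded = n_users - n_upgraded

# Share of users who upgraded.
conversion_rate = n_upgraded / n_users

# Negative-to-positive ratio, the common heuristic for scale_pos_weight.
scale_pos_weight = n_not_upgraded / n_upgraded

print(f"conversion rate: {conversion_rate:.2%}")
print(f"scale_pos_weight: {scale_pos_weight:.2f}")
```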


Deployment: Wrapped the trained model in an API function that accepts a user ID, computes that user's conversion probability, and returns the probability together with the recommended sales action as JSON.
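A sketch of what such an inference function might look like. The three action labels come from the Success Definition section; the probability thresholds, the predict_conversion name, the JSON field names, and the stand-in scoring function are all hypothetical, since the summary does not specify them.

```python
import json

# Hypothetical cut-offs; the write-up does not state the real ones.
HIGH_PRIORITY_THRESHOLD = 0.70
WARM_LEAD_THRESHOLD = 0.30


def sales_action(probability):
    """Map a conversion probability to one of the three sales actions."""
    if probability >= HIGH_PRIORITY_THRESHOLD:
        return "High Priority"
    if probability >= WARM_LEAD_THRESHOLD:
        return "Warm Lead"
    return "Low Priority"


def predict_conversion(user_id, profiles, score_fn):
    """Look up a user's features, score them, and return a JSON string.

    `profiles` maps user IDs to feature dicts; `score_fn` stands in for the
    trained model and must return a probability between 0 and 1.
    """
    features = profiles.get(user_id)
    if features is None:
        return json.dumps({"user_id": user_id, "error": "unknown user"})
    probability = float(score_fn(features))
    return json.dumps({
        "user_id": user_id,
        "conversion_probability": round(probability, 4),
        "action": sales_action(probability),
    })


def stub_model(features):
    """Placeholder scorer for the example; not the trained classifier."""
    return min(1.0, features.get("timestamp_nunique", 0) / 100)


profiles = {"u1": {"timestamp_nunique": 90}, "u2": {"timestamp_nunique": 5}}
print(predict_conversion("u1", profiles, stub_model))
print(predict_conversion("u2", profiles, stub_model))
```

Passing the scorer in as an argument keeps the business rules testable without loading the trained model.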


4. Findings & Insights

The model achieved a ROC-AUC score of 0.9868, indicating that user upgrades are highly predictable from platform behavior.


Key Insight 1: Total platform engagement over time is the strongest predictor of conversion. The most important feature was timestamp_nunique, the number of distinct event timestamps per user, which reflects the raw volume of distinct actions a user took.


Key Insight 2: Deep technical usage matters. Features such as _inserted_at_nunique and interactions with specific debugging tools (prop_$sdk_debug_replay_internal_buffer_size_sum) correlated strongly with paid upgrades, suggesting that power users who test the technical limits of the platform are the most likely to convert.
