Waze-ML

Project: Understanding Waze User Churn

This project aims to predict user churn in the Waze app using machine learning techniques. The project comprises five notebooks, each focusing on a different aspect of the data analysis process.

Each notebook plays a crucial role in understanding the data, conducting analysis, building models, and providing insights to stakeholders regarding user churn in the Waze app.

Notebook 1: Data Inspection and Analysis

Figure: churned vs. retained medians

1. Missing Values Analysis

2. Benefit of Using Median to Analyze Outliers

3. Further Questions

4. Percentage of Android vs. iPhone Users

5. Distinguishing Characteristics of Churned Users

6. Churn Rate Comparison Between iPhone and Android Users
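The median comparison that opens this notebook can be sketched with pandas. The column names (`label`, `drives`, `driven_km_drives`) follow the Waze dataset, but the toy values below are illustrative only:

```python
import pandas as pd

# Toy rows in the shape of the Waze dataset; the real notebook loads
# the full CSV with these (assumed) column names.
df = pd.DataFrame({
    "label": ["churned", "retained", "churned", "retained", "retained", "churned"],
    "drives": [10, 50, 8, 60, 55, 12],
    "driven_km_drives": [300.0, 1200.0, 250.0, 1500.0, 1100.0, 400.0],
})

# Medians resist the heavy right skew and outliers in the driving
# metrics, so churned and retained groups are compared on medians.
medians = df.groupby("label").median(numeric_only=True)
print(medians)
```

Because the median ignores extreme values, a handful of users with enormous `driven_km_drives` cannot distort the group comparison the way a mean would.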

Notebook 2: Exploratory Data Analysis (EDA)

1. Distributions of Variables

Figure: right-skew example

2. Data Quality Assessment

Figure: driven_km_drives boxplot

3. Further Questions

4. Churn and Retention Rates

Figure: count of retained vs. churned users

5. Factors Correlated with Churn

Figures: churn rate by mean km per driving day; churn rate per driving day

6. Representation of Newer Users

Figure: number of days after onboarding
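A quick way to confirm the right skew flagged in this notebook is to compare the mean and median of a driving metric. The values below are a hypothetical stand-in for a column like `driven_km_drives`:

```python
import pandas as pd

# Hypothetical right-skewed sample standing in for driven_km_drives;
# the notebook inspects the real column the same way.
s = pd.Series([120, 150, 180, 200, 220, 260, 300, 450, 900, 2500],
              name="driven_km_drives")

# A mean far above the median signals right skew, which is why these
# notebooks favor medians over means for the driving metrics.
print(f"mean = {s.mean():.1f}, median = {s.median():.1f}")
```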

Notebook 3: Hypothesis Testing

$H_0$: There is no difference in the mean number of drives between iPhone and Android users; any observed difference is due to chance.

$H_A$: There is a difference in the mean number of drives between iPhone and Android users; the observed difference is statistically significant.

1. Mean Difference in Drives between iPhone and Android Users

2. Further Questions
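The two-sample t-test behind this notebook can be sketched with `scipy.stats.ttest_ind`. The drive counts below are invented for illustration; the notebook runs the same test on the real iPhone and Android groups:

```python
from scipy import stats

# Hypothetical drive counts for the two device groups.
iphone_drives = [68, 72, 75, 80, 64, 70, 77, 73]
android_drives = [66, 69, 71, 74, 63, 67, 72, 70]

# Welch's t-test (equal_var=False) does not assume equal group variances.
t_stat, p_value = stats.ttest_ind(iphone_drives, android_drives,
                                  equal_var=False)

# Fail to reject H0 at the 5% significance level when p_value > 0.05.
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```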

Notebook 4: Logistic Regression Model Analysis

1. Influential Variable in Model’s Prediction

Figure: feature importance

2. Unexpected Predictor Variables

Figure: correlation matrix

3. Importance of Variables in the Model

4. Recommendation for Model Usage

5. Improving the Model

6. Additional Features for Model Improvement
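Reading coefficient magnitudes as a rough measure of influence, as this notebook does, can be sketched with scikit-learn on synthetic data. The feature names are assumptions standing in for the engineered Waze columns:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in features; churn is driven mostly by the first one.
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.3 * rng.normal(size=500) > 0).astype(int)

# Scaling first makes coefficient magnitudes comparable across features.
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression().fit(X_scaled, y)

# The largest |coefficient| marks the most influential predictor.
print(dict(zip(["activity_days", "drives", "km_per_driving_day"],
               model.coef_[0].round(3))))
```

On standardized inputs, comparing absolute coefficients is a reasonable first look at influence, though it says nothing about interactions between features.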

Notebook 5: Random Forest and XGBoost Models

1. Model Recommendation for Churn Prediction

2. Model Performance Metrics

Figure: model performance metrics

3. Champion Model Evaluation

4. XGBoost Model Performance and Feature Importance

Figure: feature importance

5. Confusion Matrix Insights

Figure: confusion matrix

6. Benefits of Logistic Regression Model

7. Benefits of Tree-Based Model Ensembles

8. Improving the Model

9. Desired Additional Features
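The model comparison in this notebook can be sketched as below. The data is synthetic, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost here to keep the sketch dependency-free; the notebook itself uses the `xgboost` library:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the engineered Waze features and churn label.
X = rng.normal(size=(600, 4))
y = (X[:, 0] - X[:, 1] + 0.5 * rng.normal(size=600) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Churners are the costly class to miss, so recall on the held-out
# split is the comparison metric, as in the notebook.
for name, model in [
    ("random forest", RandomForestClassifier(random_state=0)),
    ("gradient boosting", GradientBoostingClassifier(random_state=0)),
]:
    model.fit(X_train, y_train)
    rec = recall_score(y_test, model.predict(X_test))
    print(f"{name}: recall = {rec:.3f}")
```

The model with the higher recall on the held-out split would be promoted as the champion and then evaluated once on a final test set.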