Understanding Waze User Churn

Project Overview

This project aims to predict user churn in the Waze app using machine learning techniques. The project is hosted on GitHub Pages. To view the project, click here. If you’re interested in exploring the source code, you can find it here.

Notebooks Overview

  1. Data Understanding
    • Investigates and organizes the provided Waze dataset: loads it into a pandas DataFrame, compiles summary information about the data, and examines key variables.
  2. Exploratory Data Analysis (EDA)
    • Conducts exploratory data analysis on the dataset, including data cleaning, building visualizations, and evaluating and sharing results.
  3. Hypothesis Testing
    • Performs a two-sample hypothesis test (t-test) on the difference in the mean number of rides between iPhone and Android users, and communicates the resulting insights to stakeholders.
  4. Logistic Regression Model
    • Performs exploratory data analysis and builds a binomial logistic regression model to predict user churn, evaluating model performance and interpreting results.
  5. Random Forest and XGBoost Models
    • Implements Random Forest and XGBoost models to identify the factors driving user churn, compares the models, and recommends next steps.
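
As an illustration of the two-sample test described in notebook 3, here is a minimal sketch using only the Python standard library. It assumes Welch's unequal-variance variant of the t-test, which may not be the exact variant used in the notebook. The function name `welch_t_test` and the ride counts are hypothetical and are not taken from the Waze dataset.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t_statistic, degrees_of_freedom) for Welch's two-sample t-test.

    Hypothetical helper for illustration; not part of the project's notebooks.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)

    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    # Squared standard error of each group's mean.
    se2_a, se2_b = var_a / n_a, var_b / n_b

    # Difference in means divided by the combined standard error.
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Made-up ride counts per user, for illustration only.
iphone_rides = [62, 71, 58, 66, 73, 60]
android_rides = [64, 69, 55, 70, 61, 67]

t_stat, dof = welch_t_test(iphone_rides, android_rides)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

A t statistic close to zero means the observed difference in means is small relative to the variation within each group. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns a p-value.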

Importance of Each Notebook

The notebooks build on one another: the first two establish what the data contains, the third tests a specific difference between device groups, and the last two build and compare churn models whose results inform recommendations to stakeholders.