I am a beginner in data science and machine learning, and I created this project to upskill myself. Working on it gave me hands-on experience with cleaning raw data and analyzing it in a practical setting.
Propensity refers to the likelihood of someone doing something. A propensity model predicts the likelihood of that action from other related factors.
In this scenario, the dataset is a one-day summary of user behavior on a fictional website. Based on this behavior (e.g. signed in, clicked on the account page, etc.), the model learns to predict whether the user ultimately placed an order on the website.
I trained the model with the Gaussian Naive Bayes algorithm, then used it to make predictions on unseen data (the test sample) and decide whether a user is likely to order from the website. The likelihood estimate helps focus marketing efforts on the more valuable prospects, which cuts costs and improves profits.
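To make the idea concrete, here is a minimal from-scratch sketch of how Gaussian Naive Bayes turns behavior flags into an order propensity. It is not the project's actual code: the three feature names, the synthetic data, and the helper functions `fit_gaussian_nb` and `predict_proba` are all hypothetical, and the absolute `var_smoothing` term is a simplification of what a library implementation would do.

```python
import math
import random

def fit_gaussian_nb(X, y, var_smoothing=1e-9):
    """Estimate a prior, per-feature means, and per-feature variances for each class."""
    model = {}
    n_features = len(X[0])
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        variances = [
            # Small constant keeps the variance strictly positive (assumed simplification).
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + var_smoothing
            for j in range(n_features)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_proba(model, x):
    """Return P(class | x), treating features as independent Gaussians within each class."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for value, mu, var in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        log_post[label] = lp
    # Normalize in log space to avoid underflow.
    top = max(log_post.values())
    unnorm = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}

# Synthetic stand-in for the real dataset. Hypothetical columns:
# [signed_in, visited_account_page, added_to_basket]; label 1 = ordered.
random.seed(0)

def fake_user(rates):
    return [1 if random.random() < p else 0 for p in rates]

X = [fake_user([0.9, 0.7, 0.8]) for _ in range(200)] + \
    [fake_user([0.2, 0.1, 0.05]) for _ in range(200)]
y = [1] * 200 + [0] * 200

model = fit_gaussian_nb(X, y)
engaged = predict_proba(model, [1, 1, 1])   # active user
passive = predict_proba(model, [0, 0, 0])   # inactive user
print("propensity (engaged):", round(engaged[1], 3))
print("propensity (passive):", round(passive[1], 3))
```

The value stored under key `1` is the propensity score. Ranking users by that score, rather than only using the yes/no prediction, is what lets marketing effort be targeted at the most likely buyers.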
I used this Kaggle dataset to train the propensity predictor model. The field names are intuitive and come with explanatory descriptions.
For fellow learners who wish to recreate this model, here is a fairly simple code walk-through.