Let's start with supervised machine learning.
What is supervised machine learning?
As we learned previously, machine learning is a set of methods for making predictions based on existing data. Supervised machine learning is a subset of machine learning methods where the existing data has a specific structure: it has labels and features.
Some problems that can be solved by supervised machine learning include recommendation systems, email subject optimization and churn prediction.
Let's explore one of these problems and define these new terms with a case study.
Case: Churn Prediction
Suppose we have a subscription business and want to predict whether a given customer is likely to stay subscribed or churn.
First, we'll need some training data. This would be historical data from our customers. Some of those customers will have maintained their subscription, while others will have churned.
We eventually want to be able to predict the label for each customer: churned or subscribed.
We'll need features to make this prediction. Features are different pieces of information about each customer that might affect our label. For example, perhaps age, gender, date of last purchase, or household income will predict cancellations.
The magic of machine learning is that we can analyse many features all at once. We can use these labels and features to train a model to make predictions on new data.
Suppose we have a customer who may or may not churn soon. We can collect data on this customer, such as their age or the date of their last purchase.
We can feed this data into our trained model, and the model will give us a prediction.
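The steps above can be sketched in code. This is a minimal, illustrative example: the feature values are made up, and a simple 1-nearest-neighbour rule stands in for a real trained model.

```python
# Illustrative training data: each customer is described by two
# features, (age, days_since_last_purchase), plus a known label.
training_features = [
    (25, 5), (31, 12), (45, 60), (52, 90), (29, 8), (60, 120),
]
training_labels = ["subscribed", "subscribed", "churned",
                   "churned", "subscribed", "churned"]

def predict(customer):
    """Predict by copying the label of the most similar past customer."""
    def distance(other):
        return sum((a - b) ** 2 for a, b in zip(customer, other))
    nearest = min(range(len(training_features)),
                  key=lambda i: distance(training_features[i]))
    return training_labels[nearest]

# A new customer: 33 years old, last purchase 10 days ago.
print(predict((33, 10)))  # → subscribed
```

A production model would use many more features and a proper learning algorithm, but the shape is the same: labeled historical data goes in, and a prediction for a new customer comes out.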
If the customer is not in danger of churning, we can count on their revenue for another month! If they are in danger of churning, we can reach out to them with a special promotion or customer support to keep them subscribed.
Once we train a supervised machine learning model, how do we know if it's any good? When we collect historical data to train our model, it's good practice to withhold some of it rather than feeding all of it into training. This withheld data is called a test set, and it can be used to evaluate how well the model performs.
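Holding out a test set can be as simple as shuffling the historical records and splitting them. A minimal sketch, assuming a list of (features, label) records with made-up values:

```python
import random

# Synthetic (features, label) records for illustration only.
records = [((age, age % 7), "churned" if age > 50 else "subscribed")
           for age in range(20, 70)]

random.seed(0)        # fixed seed so the shuffle is reproducible
random.shuffle(records)

split = int(0.8 * len(records))   # 80% for training, 20% withheld
train_set, test_set = records[:split], records[split:]

print(len(train_set), len(test_set))  # → 40 10
```

The model is trained only on `train_set`; `test_set` is never shown to it during training, so accuracy measured there estimates how the model will do on genuinely new customers.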
In our example, we could ask the model to predict whether a set of customers would churn, and then measure how often those predictions were correct. It's equally important to measure how often the model incorrectly predicted that a customer wouldn't churn. Checking both outcomes is particularly important for rare events.
Suppose our subscription is amazing and only 3 percent of customers ever cancel. Then a model could be 97 percent accurate overall just by always predicting that a customer will remain subscribed. Only by examining the accuracy for each class do we realize that the model is 0 percent accurate at predicting churn when churn is the actual outcome.
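This pitfall is easy to demonstrate with a few lines of code. The labels below are synthetic, chosen to match the 3-percent churn rate in the example:

```python
# 100 customers: 3 actually churned, 97 stayed subscribed.
actual = ["churned"] * 3 + ["subscribed"] * 97

# A lazy "model" that always predicts the majority class.
predicted = ["subscribed"] * 100

# Overall accuracy looks great...
overall = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# ...but accuracy on the churned class alone tells the real story.
churners = [(a, p) for a, p in zip(actual, predicted) if a == "churned"]
churn_accuracy = sum(a == p for a, p in churners) / len(churners)

print(f"overall accuracy: {overall:.0%}")        # → overall accuracy: 97%
print(f"accuracy on churners: {churn_accuracy:.0%}")  # → accuracy on churners: 0%
```

This is why class-wise metrics (often called recall when measured per class) matter whenever one outcome is much rarer than the other.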