Binary Classification Target Function
Answering your yes / no business questions
Check This First!
This article refers to BaseModel accessed via a Docker container. Please refer to the Snowflake Native App section if you are using BaseModel as a Snowflake (SF) GUI application.
In this subpage we will look at some examples of target functions for Binary Classification - a machine learning model that classifies observations (entities) into one of two classes, commonly designated as 0 or 1.
These models help to answer all sorts of yes / no questions by outputting the probabilities of the classes, e.g.:
- Will the customer lapse ("churn") or not,
- Will the patient show up for the appointment or not,
- Is the message spam or not,
- Will the shopper spend more than a certain amount or not, etc.
Standard Template for Binary Classification Target Functions
Each target function for a classification problem will:
- accept as parameters history, future, entity and ctx, as described here,
- perform some transformation on these inputs, as explained in this section,
- output a one-dimensional numpy array of float32 data type and with a size of 1.
def target_fn(_history: Events, future: Events, _entity: Attributes, _ctx: Dict) -> np.ndarray:
    # transformation of events into the desired target
    return np.array(target, dtype=np.float32)
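The output requirement above (a one-dimensional float32 array of size 1) can be checked with a small helper. This is a minimal sketch using only NumPy; the name is_valid_binary_target is hypothetical and not part of BaseModel, and the additional check that the value is 0 or 1 is an assumption based on the class labels described earlier.

```python
import numpy as np


def is_valid_binary_target(target) -> bool:
    """Return True if `target` matches the output contract described above.

    Hypothetical helper, not a BaseModel function.
    """
    return (
        isinstance(target, np.ndarray)
        and target.ndim == 1                  # one-dimensional
        and target.size == 1                  # exactly one element
        and target.dtype == np.float32        # float32 data type
        and float(target[0]) in (0.0, 1.0)    # assumed: label is 0 or 1
    )
```

A helper like this can be useful while developing a target function, since a scalar or a float64 array is easy to return by accident.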
We will now explore the transformations looking at two examples of functions for binary classification problems.
Is a customer likely to churn?
An ideal model would return the label of 1 for every customer who would lapse, and 0 for all who would continue to engage. We should start by establishing a clear criterion related to events in history and future that would enable us to label customers like that.
The steps are as follows:
- For efficiency, we exclude customers with no transaction events in history (already churned) upfront,
- We then flag the remaining customers (who had transaction events in history) and:
  - no transaction events in future as 1 ( = "will churn"),
  - some transaction events in future as 0 ( = "will not churn").
Click on the recipe below to see the entire code.
🦉 RETAIL: identify customers likely to churn over next n-days (Open Recipe)
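For orientation, the churn labelling steps can also be sketched in plain Python. This is not the recipe's code and does not use the BaseModel API: the EventsLike type, the "transactions" key, and returning None to exclude a customer are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

import numpy as np

# Hypothetical stand-in for BaseModel's Events:
# data-source name -> list of event records.
EventsLike = Dict[str, List[dict]]


def churn_target(history: EventsLike, future: EventsLike) -> Optional[np.ndarray]:
    """Label a customer 1 if they have no future transactions, else 0."""
    # Step 1: skip customers with no transactions in history (already churned).
    # Returning None to signal exclusion is an assumption of this sketch.
    if not history.get("transactions"):
        return None
    # Step 2: no transactions in future -> 1 ("will churn"), otherwise 0.
    churned = 0 if future.get("transactions") else 1
    return np.array([churned], dtype=np.float32)
```

Wrapping the label in a list keeps the result one-dimensional with a size of 1, as the template requires.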
Will a customer spend more than a certain amount on a category?
Let's look at a more complex use case: targeting customers most likely to purchase products from a list of categories and spend above a certain threshold. The steps here will be as follows:
- First, exclude all purchase events outside the focus product category,
- Sum the predicted spend amount (appropriate column of transaction events in future) by customer,
- Compare with the expected threshold and flag as 0 if the sum is below it, and as 1 if equal to it or above.
🦉 RETAIL: flag shoppers likely to spend above a certain amount on specific categories (Open Recipe)
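The three steps for the spend case can be sketched in the same plain-Python style. Again, this is an illustration under stated assumptions rather than the recipe's code: the "transactions" key and the "category" and "amount" fields are hypothetical names, and a real target function would read them from BaseModel's Events object instead of a dictionary.

```python
from typing import Dict, List, Set

import numpy as np

# Hypothetical stand-in for BaseModel's Events:
# data-source name -> list of event records.
EventsLike = Dict[str, List[dict]]


def spend_target(future: EventsLike, categories: Set[str], threshold: float) -> np.ndarray:
    """Label a customer 1 if future spend on `categories` reaches `threshold`."""
    # Step 1: keep only purchase events in the focus categories.
    in_focus = [
        event for event in future.get("transactions", [])
        if event["category"] in categories
    ]
    # Step 2: sum the spend amount over the remaining events.
    total_spend = sum(event["amount"] for event in in_focus)
    # Step 3: flag as 1 if the sum is equal to or above the threshold, else 0.
    label = 1 if total_spend >= threshold else 0
    return np.array([label], dtype=np.float32)
```

A customer with no future events in the focus categories has a sum of 0 and is therefore labelled 0 for any positive threshold.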