Defining Model Task and Target
How to set the objective for your model?
Check This First!
This article refers to BaseModel accessed via Docker container. Please refer to Snowflake Native App section if you are using BaseModel as SF GUI application.
Selecting ML problem for your business objective
Foundation models can be adapted to a variety of different problems:
- Binary Classification
Suitable e.g. for churn prediction, no-show prediction, fraud detection etc.
- Multi-class Classification
E.g. to predict buying propensity and assign customers to campaign products.
- Multi-label Classification
E.g. to predict buying propensity of multiple products for campaign personalization.
- Recommendation
E.g. to select products for web page personalization.
- Regression
E.g. to assess customer Life-Time Value, spend prediction etc.
- Clustering*
E.g. to segment customers for CRM purposes, or group products for different taxonomies.
Target function
The target function is a key component of the training module as it provides targets for model training. It allows you
to perform transformations on data to define the training objective — set the model's target. This way, the target
doesn't have to be present explicitly in the data table, which gives you modeling freedom.
Arguments:
- history (monad.targets.entity.Events): events before the split point, e.g. the history of users' purchases.
- future (monad.targets.entity.Events): events after the split point; these events are used to create the target.
- entity (monad.targets.entity.Attributes): object containing entity attributes from the attribute data source defined in the pretrain phase.
- ctx (Dict, optional): additional information available for target definition.
The data passed to the target function as history and future are Events objects, which give access to the user's events.
Timestamps
Timestamps of events are crucial for behavioural modeling. If a record carries the timestamp of an action that was taken, it is considered an event. Attributes may also contain dates; nevertheless, we expect attribute data to hold primarily one record per main_entity_id rather than a series of events.
Timestamps are taken from the date_column provided in the pretrain phase. We can access them inside the target function with timestamps = future["key_to_your_event_datasource"].timestamps.
The target function should return an np.array with target values for a single training example, e.g. the desired target for a single user whose history and future are passed to the target function. The exact target format is specific to the type of task: recommendation, regression, binary, multiclass or multilabel classification. A description of the tasks is available in Task-specific training.
Joined attributes
When data sources have joined attributes, the target function requires a helper that correctly accesses the joined attributes, so that they can be used in the target function in a simple way.
For that, use the get_qualified_column_name function:
Arguments:
- column_name: name of the column
- data_sources_path: list representing the join hierarchy. It should omit the main data source to which the other data sources were joined.
Example:
There are three data sources:
+ "transactions",
+ "products",
+ "categories",
and the join hierarchy looks like this:
+ "categories" is joined to "products"
+ "products" is joined to "transactions".
To use the "category_name" column, originally present in "categories", we should qualify the column name with get_qualified_column_name. The correct call looks as follows:
get_qualified_column_name("category_name", ["products", "categories"])
A complete target function that uses it is shown in the section "Propensity with joined attribute table".
Example target functions
Please refer to our library of Use Cases for examples and in-depth explanations on creating target functions.
Propensity
Propensity modeling is a statistical approach used in predictive analytics to estimate the likelihood that a particular event will happen, for example:
- user will make a purchase
- user will purchase a specific item or items
- user will respond to a marketing campaign in a specific way
Thanks to this approach, businesses can tailor their strategies to target the individuals or segments that are most likely to take a desired action or to change their behaviour in a desired way.
See the following example use case:
🦉 Use case: predict the probability the user will buy products in the following categories (Open Recipe)
Propensity with joined attribute table
This example is similar to the one above. The only difference is that article information was kept in a separate data source (the articles table) and was joined at the Foundation Model training stage in the config.yaml file.
def propensity_target_fn(history: Events, future: Events, attributes: Attributes, ctx: Dict) -> np.ndarray:
    TARGET_NAMES = [
        "Fruits",
        "Dairy",
        "Bakery",
        "Meat and Poultry",
        "Snacks and Confectionery",
        "Beverages",
        "Canned and Packaged Foods",
        "Household Supplies",
        "Personal Care",
        "Cleaning Products",
    ]
    # Qualify the column that comes from the joined "articles" data source
    TARGET_ENTITY = get_qualified_column_name(column_name="Product Category", data_sources_path=["articles"])
    purchase_target, _ = (
        future["purchases"]
        .groupBy(TARGET_ENTITY)
        .exists(groups=TARGET_NAMES)
    )
    # Excluding customers who did not buy anything
    if purchase_target.sum() == 0:
        return None
    return purchase_target
It will check if the user bought products in the provided categories and return an array with 1 if there was a
purchase and 0 if there was no purchase within the given category. Users that did not make any purchase will be excluded from training.
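To make the shape of the returned array concrete, the sketch below reproduces the described output with plain NumPy. It is only a stand-in for what `groupBy(...).exists(groups=...)` is described as returning, not BaseModel's implementation; `exists_per_group`, `propensity_like_target` and the sample data are hypothetical names introduced for illustration, and the float dtype is an assumption.

```python
from typing import List, Optional

import numpy as np


def exists_per_group(purchased: List[str], groups: List[str]) -> np.ndarray:
    """Return 1.0 for each group that occurs in `purchased`, else 0.0."""
    seen = set(purchased)
    return np.array([1.0 if g in seen else 0.0 for g in groups], dtype=np.float32)


def propensity_like_target(purchased: List[str], groups: List[str]) -> Optional[np.ndarray]:
    target = exists_per_group(purchased, groups)
    # Mirror the example above: an entity with no purchase in any of the
    # target categories is excluded by returning None.
    if target.sum() == 0:
        return None
    return target


groups = ["Fruits", "Dairy", "Bakery"]
buyer = propensity_like_target(["Dairy", "Dairy", "Fruits"], groups)  # 1 for Fruits and Dairy, 0 for Bakery
non_buyer = propensity_like_target(["Beverages"], groups)  # None, so this entity is skipped
```

The position of each value follows the order of the `groups` list, which is why the target names are fixed up front in the example.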
Use case: predicting the user's favourite category
🦉 Use case: predicting the user's favourite category (Open Recipe)
Binary Classification
Binary classification is a type of machine learning task that categorizes outcomes into two distinct classes. For example, we can predict whether:
- a message is spam
- a patient is sick
- a customer is likely to churn
Churn modeling specifically is a binary classification problem applied in customer relationship management, where typically two categories are predicted: will churn and will not churn.
Knowing upfront which entities have the highest probability of churning gives businesses the opportunity to prevent unwanted churn.
Use case: predicting the probability that the user will churn in the next n-days
🦉 Use case: probability to churn in next n-days (Open Recipe)
First, users with no events in history are excluded. Then:
- 1 is returned when there are no transactions in future, and
- 0 is returned when there are some purchases in the provided target time-window.
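The labelling rule above can be sketched in a few lines. This is a minimal illustration on plain Python sequences, not the recipe itself: `churn_label` and its arguments are hypothetical, and in an actual target function the event counts would presumably come from `count()` on the relevant event data source.

```python
from typing import Optional, Sequence

import numpy as np


def churn_label(history_events: Sequence, future_events: Sequence) -> Optional[np.ndarray]:
    # Entities without any history are excluded from training.
    if len(history_events) == 0:
        return None
    # 1 = churned (no events in the target window), 0 = still active.
    churned = 1.0 if len(future_events) == 0 else 0.0
    return np.array([churned], dtype=np.float32)


no_history = churn_label([], ["purchase"])             # None: excluded
churned = churn_label(["purchase"], [])                # array with a single 1
active = churn_label(["purchase"], ["purchase", "x"])  # array with a single 0
```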
Use case: predicting the probability that the user will purchase a product from a list for price above threshold
🦉 Use case: predicting the probability that the user will purchase a product from a list for price above threshold (Open Recipe)
Regression
Regression modeling is a statistical technique used in predictive analytics to estimate the relationship between a dependent variable (often referred to as the target variable) and one or more independent variables (often referred to as predictors or features). The aim of regression analysis is to understand how changes in the independent variables are associated with changes in the dependent variable. This type of modeling is extensively used across various fields such as finance, marketing, healthcare, and social sciences for forecasting, trend analysis, and decision making.
There are multiple use cases where we may want to apply regression modeling in the context of behavioral data:
- Customer Purchase Behaviour - like Lifetime Value prediction
- Employee Satisfaction and Performance Prediction
- User Engagement on Digital Platforms
Use case: predicting the amount of money spent or number of purchased items as a regression task
def money_spent_fn(history: Events, future: Events, attributes: Attributes, ctx: Dict):
    # Total value of the entity's transactions after the split point
    sum_purchase = future["transactions"]["price"].events.sum()
    return np.array([sum_purchase])

def items_bought_count_fn(history: Events, future: Events, attributes: Attributes, ctx: Dict):
    # Number of transaction events after the split point
    items_bought_cnt = future["transactions"].count()
    return np.array([items_bought_cnt], dtype=np.float32)
The first function, money_spent_fn, can be used as a target function for a regression task, as it returns the sum of money spent per entity in the future. Likewise, the second function, items_bought_count_fn, can be used as a regression target, as it returns the number of purchased items per entity in the future.
Check detailed Recipe here:
🦉 Use case: predicting the amount of money spent or number of purchased items as a regression task (Open Recipe)
Recommendation
Recommendation modeling is a specialized subset of machine learning that focuses on predicting the preferences or interests of users and suggesting items or services they are likely to enjoy or find useful.
This type of modeling is widely used in e-commerce, streaming services, and content platforms to personalise user experiences and increase engagement. Recommendation models analyse past user behavior, item characteristics, and sometimes contextual information to identify patterns and relationships between users and items.
In BaseModel, a recommendation target function requires some specific parameters to work; these are described below.
Use case: recommender system for very next product basket that user will buy in the future as a recommendation task
🦉 Use case: recommender system for very next product basket that user will buy in the future as a recommendation task (Open Recipe)
from monad.ui.target_function import Sketch, sequential_decay, sketch

def recommendation_fn(history: Events, future: Events, attributes: Attributes, ctx: Dict) -> Sketch:
    future_transactions = future["transactions"]
    article_ids = future_transactions["article_id"]
    # gamma=0 puts all the weight on the very next basket
    training_weights = sequential_decay(future_transactions, gamma=0)
    return sketch(article_ids, training_weights)
Recommendation target functions require:
- future definition: we need to define the stream of future events for the model to predict and validate.
- sequential_decay function: calculates the weights (importance) based on the order of the timestamps. By default, we are interested in the first basket predicted, meaning gamma=0.
- sketch function: creates the final representations of events, which are required for the recommendation target function. It is only needed in the return clause and takes entities and weights as arguments.
- _history: this parameter is mostly optional. We can use it to filter out products that were already purchased by users, so that only new items are recommended.
BaseModel creates _sketches_, an advanced representation of entities. Sketches are mathematical structures used by BaseModel; they are described in detail in our blog post.
Types of Decay Functions
For recommendation modeling, BaseModel supports several decay functions.
Sequential Decay
This method weights future purchases by the position of their basket in the sequence. The amount of time that passes between events does not matter; only their order is important.
For example, when we want to predict the next basket, in most cases we only need the very next basket purchased, so only that basket should be taken into consideration in training. The gamma parameter controls this.
The gamma parameter can take any value between 0 and 1, for example:
- gamma=0: the first event has a weight of 1 and the next one a weight of 0. All subsequent ones have a weight of 0 as well.
- gamma=1: the first event has a weight of 1 and all subsequent events have a weight of 1 as well. This means the model uses all future events for training when learning to serve recommendations.
- gamma=0.5, or any other number between 0 and 1: the first event has a weight of 1, the next one 0.5, the next one 0.25, etc. This means that events further from the split point are less important for the training target.
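The three cases above are consistent with a weight of gamma raised to the position of the basket in the sequence. The sketch below shows that reading with plain NumPy; it is an illustration under that assumption, not the library's `sequential_decay`, and `sequential_weights` is a hypothetical helper. It also assumes that events sharing a timestamp form one basket and therefore share one weight.

```python
import numpy as np


def sequential_weights(timestamps, gamma: float) -> np.ndarray:
    """Weight each event by gamma ** (rank of its basket in time order)."""
    # Events with the same timestamp get the same rank (0 for the first basket).
    _, basket_rank = np.unique(np.asarray(timestamps), return_inverse=True)
    # NumPy evaluates 0.0 ** 0 as 1.0, so gamma=0 keeps only the first basket.
    return np.power(float(gamma), basket_rank)


# Two items in the first basket, followed by two later single-item baskets.
ts = [10, 10, 25, 90]
w_next_only = sequential_weights(ts, gamma=0)   # weights 1, 1, 0, 0
w_all = sequential_weights(ts, gamma=1)         # weights 1, 1, 1, 1
w_half = sequential_weights(ts, gamma=0.5)      # weights 1, 1, 0.5, 0.25
```

Note that the gaps between the timestamps (15 and 65 units) play no role here; only the order does.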
Time Decay
Time Decay can be used in exactly the same way as sequential decay in the example above. In that case the weights depend not only on the order of events but also on how much time has passed between them: the more time has passed, the smaller the weight an event gets.
Time decay uses a different parameter than sequential decay, called daily_decay. Depending on the value:
- daily_decay=1: the first event has a weight of 1 and the next one a weight of 0. All subsequent ones have a weight of 0 as well.
- daily_decay=0: the first event has a weight of 1 and all subsequent events have a weight of 1 as well. This means the model uses all future events for training when learning to serve recommendations.
- daily_decay=0.5, or any other number between 0 and 1: the first event has a weight of 1, an event one day later has a weight of 0.5, one on the following day 0.25, etc. If the next event happens 2 days after that, its weight is 0.25 * 0.5 * 0.5 = 0.0625.
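One formula that matches all three cases above is a weight of (1 - daily_decay) raised to the number of days elapsed since the first future event. The sketch below uses that formula with plain NumPy; it is inferred from the listed examples rather than taken from the library, and `time_decay_weights` is a hypothetical helper.

```python
import numpy as np


def time_decay_weights(days_since_first, daily_decay: float) -> np.ndarray:
    """Weight = (1 - daily_decay) ** days elapsed since the first future event."""
    days = np.asarray(days_since_first, dtype=np.float64)
    # NumPy evaluates 0.0 ** 0.0 as 1.0, so the first event always keeps weight 1.
    return np.power(1.0 - daily_decay, days)


# Events on day 0, day 1, day 2, and then two days later on day 4.
days = [0, 1, 2, 4]
w_half = time_decay_weights(days, daily_decay=0.5)  # 1, 0.5, 0.25, 0.0625
w_first = time_decay_weights(days, daily_decay=1)   # 1, 0, 0, 0
w_all = time_decay_weights(days, daily_decay=0)     # 1, 1, 1, 1
```

Unlike sequential decay, skipping a day lowers the weight even though the event is still the next one in order.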
Please note: in our case, we are predicting the next basket. A basket is defined as the products that share the same purchase timestamp.
Possible Operations in the Target Function
A few operations can be applied within the target function to both history and future event objects. The following operations are available:
Operation | Description | Arguments | Returns |
---|---|---|---|
count() | Return the number of events. | None | Number of events - Integer |
apply() | Applies a function to a target column. | - func (Callable[[Any], Any]) : Function to apply - target (str) : Target column name | DataSourceEvents: Events with column target transformed |
filter() | Filters events based on a condition. | - by (str) : Column to check condition against - condition (Callable[[Any], bool]) : Filtering condition | DataSourceEvents: Filtered events |
groupBy() | Groups the events by values in a column. | - by (Union[str, List[str]]) : Columns to group by | Requires one of the operators listed below to return anything |
On groupBy() we can additionally perform the following operations:
Operation | Description | Arguments | Returns |
---|---|---|---|
count() | Counts elements in each group (within groupBy()). | - normalize (Optional[bool], optional) : Normalize counts (default: False) - groups (Optional[List[Any]], optional) : Limit grouping (default: None) | Tuple [np.ndarray, List[str]]: Tuple with counts elements per each group and group names |
sum() | Sums values of a column in each group. | - target (str) : Column for grouping - groups (Optional[List[Any]], optional) : Limit grouping (default: None) | Tuple [np.ndarray, List[str]]: Tuple with sum of elements per each group and group names. |
mean() | Computes mean of column values in each group. | - target (str) : Column for grouping - groups (Optional[List[Any]], optional) : Limit grouping (default: None) | Tuple [np.ndarray, List[str]]: Tuple with mean of elements per each group and group names. |
min() | Finds minimum of column values in each group. | - target (str) : Column for grouping - groups (Optional[List[Any]], optional) : Limit grouping (default: None) | Tuple [np.ndarray, List[str]]: Tuple with min of the elements per each group and group names. |
max() | Finds maximum of column values in each group. | - target (str) : Column for grouping - groups (Optional[List[Any]], optional) : Limit grouping (default: None) | Tuple [np.ndarray, List[str]]: Tuple with max of the elements per each group and group names. |
exists() | Checks if groups are not empty. | - groups (List[Any]) : Groups to check | Tuple [np.ndarray, List[str]]: Tuple with array indicating existence of the elements per each group and group names. |
apply() | Applies a function to each group. | - func (Callable[[np.ndarray], Any]) : Function to apply - default_value (Any) : Default output - target (str) : Column for grouping - groups (Optional[List[Any]], optional) : Limit grouping (default: None) | Tuple [Any, List[str]]: Tuple with values returned by func per each group and group names. |
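The grouped operators are documented as returning a tuple of an array and the group names, which is why the propensity example unpacks the result as `purchase_target, _ = ...`. The sketch below mimics that return shape for a grouped sum using plain Python and NumPy. It is a stand-in for illustration only: `group_sum` is hypothetical, and the ordering of groups when `groups` is not given (sorted here) is an assumption.

```python
from typing import Any, List, Optional, Sequence, Tuple

import numpy as np


def group_sum(
    keys: Sequence[Any],
    values: Sequence[float],
    groups: Optional[List[Any]] = None,
) -> Tuple[np.ndarray, List[Any]]:
    """Sum `values` per key and return (sums, group names)."""
    # With `groups` given, only those groups are reported, in that order.
    names = list(groups) if groups is not None else sorted(set(keys))
    totals = {name: 0.0 for name in names}
    for key, value in zip(keys, values):
        if key in totals:
            totals[key] += value
    return np.array([totals[name] for name in names]), names


categories = ["Dairy", "Fruits", "Dairy", "Bakery"]
prices = [2.0, 1.5, 3.0, 4.0]
sums, names = group_sum(categories, prices, groups=["Dairy", "Bakery", "Meat"])
# sums holds 5.0 for Dairy, 4.0 for Bakery and 0.0 for the absent Meat group
```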
Validating Target function
For complex modeling needs, writing a target function can be more complicated. To quickly validate the function and its output, and to ensure it models exactly what the author intended, we provide a validation function that can be run before executing the entire downstream task.
verify_target function:
Argument | Type | Description |
---|---|---|
target_fn | TargetFunction | Target function to evaluate. |
fm_checkpoint_path | str or Path | Path to the FM checkpoint. |
task | Task | Task for which the target function will be applied. |
data_params_overrides | DataParams | Overrides for data parameters. |
num_percentage_entities | int | Percentage of all entities to validate target_fn against. Defaults to 1. |
percentage_nones_allowed | int | Allowed percentage of invalid targets. |
The validation function will raise one of the following error types:
- TypeError - if return types are incorrect or inconsistent
- ValueError - if the percentage_nones_allowed check fails, i.e. too many evaluations return None
- RuntimeError - if running the target function fails.
If the validation goes smoothly it returns example output of the target function.
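To give an intuition for the checks listed above, the sketch below applies two of them to a list of already computed targets. It is not `verify_target` and does not reproduce its signature: `check_targets` is a hypothetical helper, and treating percentage_nones_allowed as an upper bound on the share of None results is an assumption about its meaning.

```python
from typing import List, Optional

import numpy as np


def check_targets(
    targets: List[Optional[np.ndarray]],
    percentage_nones_allowed: float,
) -> np.ndarray:
    """Validate sample targets and return one example output."""
    if not targets:
        raise ValueError("no targets to validate")
    valid = [t for t in targets if t is not None]
    # Every non-None target is expected to be a NumPy array.
    for target in valid:
        if not isinstance(target, np.ndarray):
            raise TypeError(f"expected np.ndarray, got {type(target).__name__}")
    # Share of excluded entities, in percent.
    none_pct = 100.0 * (len(targets) - len(valid)) / len(targets)
    if not valid or none_pct > percentage_nones_allowed:
        raise ValueError(f"{none_pct:.1f}% of targets are None")
    return valid[0]


sample = [np.array([1.0]), None, np.array([0.0]), np.array([1.0])]
example = check_targets(sample, percentage_nones_allowed=50)  # 25% None, passes
```

Running such a check on a small percentage of entities first is cheaper than discovering a faulty target function partway through training.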