Defining Model Task and Target

How do you set the objective for your model?

Selecting an ML problem for your business objective

Foundation models can be adapted to a variety of problems:

  • Binary Classification
    Suitable e.g. for churn prediction, no-show prediction, fraud detection etc.

  • Multi-class Classification
    E.g. to predict buying propensity and assign customers to campaign products.

  • Multi-label Classification
    E.g. to predict buying propensity of multiple products for campaign personalization.

  • Recommendation
    E.g. to select products for web page personalization.

  • Regression
    E.g. to assess customer Life-Time Value, spend prediction etc.

  • Clustering*
    E.g. to segment customers for CRM purposes, or group products for different taxonomies.

Target function

The target function is a key component of the training module, as it provides targets for model training. It lets you
perform transformations on the data to define the training objective, i.e. to set the model's target. This way, the target
doesn't have to be present explicitly in the data table, which gives you modeling freedom.

Arguments:

  • history (monad.targets.entity.Events): events before the split point, e.g. the history of users' purchases.
  • future (monad.targets.entity.Events): events after the split point; these events are used to create the target.
  • entity (monad.targets.entity.Attributes): object containing entity attributes from the attribute datasource defined in
    the pretrain phase.
  • ctx (Dict, optional): additional information available for target definition.

The data passed to the target function as history and future are Events objects that expose the user's events.
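To make the contract concrete, here is a minimal, hedged sketch of a binary target function. Plain Python stand-ins (a dict of lists) replace the Events and Attributes objects purely for illustration; the function name and the "transactions" datasource key are assumptions for this example, not part of the BaseModel API.

```python
import numpy as np

# Illustrative only: in BaseModel, `future` is an Events object keyed by
# datasource name; here a plain dict of lists stands in for it.
def any_purchase_target_fn(_history, future, _entity, _ctx) -> np.ndarray:
    """Binary target: 1.0 if the user made any purchase after the split point."""
    n_future_purchases = len(future["transactions"])
    return np.array([1.0 if n_future_purchases > 0 else 0.0])

buyer = any_purchase_target_fn([], {"transactions": [("2024-01-02", 9.99)]}, {}, {})
non_buyer = any_purchase_target_fn([], {"transactions": []}, {}, {})
```

Whatever the task, the function receives one entity's history and future and returns an np.array with that entity's target values.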

Timestamps

Timestamps of the events are crucial for behavioural modeling. If a record carries the timestamp of an action the user took, it is treated as an event. Attribute data may also contain dates, but we expect it to hold primarily one row of attributes per main_entity_id rather than events.

Timestamps are taken from the date_column provided in the pretrain phase. Inside the target function, they can be accessed via timestamps = future["key_to_your_event_datasource"].timestamps.

The target function should return an np.array with target values for a single training example, e.g. the desired target for a single user whose history and future are passed to the target function. The exact target format will be specific to the type of task, such as recommendation, regression, binary, multiclass or multilabel classification. Description of tasks is available in Task-specific training.

Joined attributes

If your datasources contain joined attributes, the target function requires a special helper function that correctly accesses the joined attributes, so that they can be used in the target function in a simple way.

For that, use the get_qualified_column_name function:

Arguments:

  • column_name: name of the column
  • data_sources_path: list representing the join hierarchy. It should omit the main data source
    to which the other data sources were joined.
    Example:
    There are three data sources:
    + "transactions",
    + "products",
    + "categories",
    and the join hierarchy looks like this:
    + "categories" is joined to "products",
    + "products" is joined to "transactions".

When we want to use the "category_name" column, originally present in "categories", we should use get_qualified_column_name to qualify the column name. The correct use looks as follows:

get_qualified_column_name("category_name", ["products", "categories"])

An example can be viewed below.

Example target functions

Please refer to our library of Use Cases for examples and in-depth explanations on creating target functions.

Propensity

Propensity modeling is a statistical approach used in predictive analytics to estimate the likelihood that a particular event will happen, for example:

  • user will make a purchase
  • user will purchase a specific item or items
  • user will respond to a marketing campaign in a specific way

Thanks to this approach, businesses can tailor their strategies to target individuals or segments that are most likely to engage in the desired action, or most likely to change their behaviour in the desired way.

See the following example use case:

ℹ️

Propensity with joined attribute table

This example is similar to the one above; the only difference is that the article information was kept in a separate data source (the articles table) and was joined at the Foundation Model training stage in the config.yaml file.


def propensity_target_fn(_history: Events, future: Events, _entity: Attributes, _ctx: Dict) -> np.ndarray:
    TARGET_NAMES = [
        "Fruits",
        "Dairy",
        "Bakery",
        "Meat and Poultry",
        "Snacks and Confectionery",
        "Beverages",
        "Canned and Packaged Foods",
        "Household Supplies",
        "Personal Care",
        "Cleaning Products",
    ]
    TARGET_ENTITY = get_qualified_column_name(column_name="Product Category", data_sources_path=["articles"])

    purchase_target, _ = (
        future["purchases"]
        .groupBy(TARGET_ENTITY)
        .exists(groups=TARGET_NAMES)
    )

    # Exclude customers who did not buy anything
    if purchase_target.sum() == 0:
        return None

    return purchase_target

It checks whether the user bought products in the provided categories and returns an array with 1 if there was a
purchase within the given category and 0 if there was not. Users who did not make any purchase are excluded from training.

Use case: predicting the user's favourite category

ℹ️
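As a hedged illustration of what such a target could look like, the sketch below picks the most frequent category among future purchases and returns a one-hot vector. The category list is invented for the example, and plain Python stand-ins replace the Events API; this is not the recipe's implementation.

```python
import numpy as np
from collections import Counter

CATEGORIES = ["Fruits", "Dairy", "Bakery"]  # invented for the example

# Illustrative only: `future` is a plain dict of category names standing in
# for an Events object; returns a one-hot vector over CATEGORIES.
def favourite_category_fn(_history, future, _entity, _ctx):
    counts = Counter(future["purchases"])
    if not counts:
        return None  # exclude users with no future purchases
    favourite = counts.most_common(1)[0][0]
    return np.array([1.0 if c == favourite else 0.0 for c in CATEGORIES])

target = favourite_category_fn([], {"purchases": ["Dairy", "Dairy", "Bakery"]}, {}, {})
```

The one-hot output matches the multi-class classification target format.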


Binary Classification

Binary Classification is a type of machine learning model that categorizes outcomes into two distinct classes. For example, we can predict:

  • whether a message is spam or not
  • whether the patient is sick
  • whether the customer is likely to churn

Churn modeling specifically is a binary classification problem applied in customer relationship management, where typically two categories are predicted: will churn and will not churn.

By identifying the entities with the highest probability of churning, businesses have the opportunity to prevent unwanted churn by having this information upfront.

Use case: predicting the probability that the user will churn in the next n-days

ℹ️

First, users with no events in history are excluded.
1 is returned when there are no transactions in the future, and
0 is returned when there are purchases in the provided target time window.
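The steps above can be sketched as follows. This is a hedged illustration only: plain dicts of event lists stand in for the Events objects, and the "transactions" key is an assumption for the example.

```python
import numpy as np

# Illustrative only: plain dicts of event lists stand in for the Events
# objects passed by BaseModel.
def churn_target_fn(history, future, _entity, _ctx):
    # Exclude users with no events in history
    if len(history["transactions"]) == 0:
        return None
    # 1 = churned (no future transactions), 0 = still active
    churned = len(future["transactions"]) == 0
    return np.array([1.0 if churned else 0.0])
```

For instance, a user with past transactions and an empty future window gets the target [1.0].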

Use case: predicting the probability that the user will purchase a product from a list for a price above a threshold

ℹ️
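A hedged sketch of such a target: return 1 if any future purchase matches both the product list and the price threshold. The product IDs, threshold value, and plain-dict stand-ins are all invented for illustration; this is not the recipe's implementation.

```python
import numpy as np

PRODUCT_LIST = {"A123", "B456"}   # invented product ids
PRICE_THRESHOLD = 50.0            # invented threshold

# Illustrative only: future["transactions"] is a list of (product_id, price)
# tuples standing in for an Events object.
def high_value_purchase_fn(_history, future, _entity, _ctx) -> np.ndarray:
    hit = any(
        pid in PRODUCT_LIST and price > PRICE_THRESHOLD
        for pid, price in future["transactions"]
    )
    return np.array([1.0 if hit else 0.0])
```

Both conditions must hold for the same transaction; a cheap purchase of a listed product, or an expensive purchase of an unlisted one, yields a 0 target.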


Regression

Regression modeling is a statistical technique used in predictive analytics to estimate the relationship between a dependent variable (often referred to as the target variable) and one or more independent variables (often referred to as predictors or features). The aim of regression analysis is to understand how changes in the independent variables are associated with changes in the dependent variable. This type of modeling is extensively used across various fields such as finance, marketing, healthcare, and social sciences for forecasting, trend analysis, and decision making.

There are multiple use cases where we may want to apply regression modeling in the context of behavioral data:

  • Customer Purchase Behaviour - like Lifetime Value prediction
  • Employee Satisfaction and Performance Prediction
  • User Engagement on Digital Platforms

Use case: predicting the amount of money spent or number of purchased items as a regression task

def money_spent_fn(_history: Events, future: Events, _entity: Attributes, _ctx: Dict):
    sum_purchase = future["transactions"]["price"].events.sum()

    return np.array([sum_purchase])


def items_bought_count_fn(_history: Events, future: Events, _entity: Attributes, _ctx: Dict):
    items_bought_cnt = future["transactions"].count()

    return np.array([items_bought_cnt], dtype=np.float32)

The first function, money_spent_fn, can be used as a target function for a regression task, as it returns the sum of the money spent per entity in the future. Similarly, the second function, items_bought_count_fn, returns the number of purchased items per entity in the future.

Check the detailed Recipe here:

📘


Recommendation

Recommendation modeling is a specialized subset of machine learning that focuses on predicting the preferences or interests of users and suggesting items or services they are likely to enjoy or find useful.

This type of modeling is widely used in e-commerce, streaming services, and content platforms to personalise user experiences and increase engagement. Recommendation models analyse past user behavior, item characteristics, and sometimes contextual information to identify patterns and relationships between users and items.

In BaseModel, recommendation target functions require some specific parameters to work, which are described below.

Use case: recommender system for the very next product basket that the user will buy in the future, as a recommendation task

ℹ️

from monad.ui.target_function import Sketch, sequential_decay, sketch

def recommendation_fn(_history: Events, future: Events, _entity: Attributes, _ctx: Dict) -> Sketch:
    future_transactions = future["transactions"]
    article_ids = future["transactions"]["article_id"]
    training_weights = sequential_decay(future_transactions, gamma=0)
    return sketch(article_ids, training_weights)

Recommendation target functions require:

  • future definition: we need to define the stream of future events for the model to predict and validate.
  • sequential_decay: a function that calculates the weights (importance) based on the order of the timestamps. By default, we are interested in the first predicted basket, meaning gamma=0.
  • sketch: a function that creates the final representations of events, which are required by the recommendation target function. It is only needed in the return clause and takes entities and weights as arguments.
  • _history: this parameter is mostly optional. We can use it to filter out products that were already purchased by users, so that only new items are recommended.

BaseModel creates _sketches_, an advanced representation of entities. Sketches are mathematical structures used by BaseModel; they are described in detail in our blog post.

Types of Decay Functions

For recommendation modeling, BaseModel supports several decay functions.

Sequential Decay

This method assigns weights to future purchases according to their order in the sequence of baskets. How much time has passed between events does not matter; only the sequence is important.

For example, if we want to predict the next basket, in most cases we only need to predict the very next purchased basket, so only that basket should be taken into consideration in training. For this purpose we use the gamma parameter.

The gamma parameter can take any value between 0 and 1, for example:

  • gamma=0: the first event will have weight 1, and the next one will have weight 0. All subsequent ones will have weight 0 as well.
  • gamma=1: the first event will have weight 1, and all subsequent events will have weight 1 as well. This means the model will use all future events for training when serving recommendations.
  • gamma=0.5 (or any other number between 0 and 1): the first event will have weight 1, the next one 0.5, the next one 0.25, etc. This means events that happened later after the split point are less important for the training target.
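The weighting rule can be reproduced in one line: the i-th future basket (0-indexed) gets weight gamma**i. The helper below is only a stand-in illustrating that rule, not the sequential_decay implementation.

```python
import numpy as np

# Weight of the i-th future basket (0-indexed) under sequential decay:
# gamma ** i, so gamma=0 keeps only the very next basket.
def sequential_decay_weights(n_baskets: int, gamma: float) -> np.ndarray:
    return gamma ** np.arange(n_baskets, dtype=np.float64)
```

Note that gamma=0 still gives the first basket weight 1 (0**0 == 1), matching the bullet above.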

Time Decay

It is possible to use Time Decay in exactly the same way as sequential decay in the example above. In that case, not only the order of events matters, but rather how much time has passed between them. For example, we can assign a time decay of 1 day, and events will then get smaller weights the more time has passed.

Time decay uses a different parameter than sequential decay, called daily_decay. Depending on the value:

  • daily_decay=1: the first event will have weight 1, and the next one will have weight 0. All subsequent ones will have weight 0 as well.
  • daily_decay=0: the first event will have weight 1, and all subsequent events will have weight 1 as well. This means the model will use all future events for training when serving recommendations.
  • daily_decay=0.5 (or any other number between 0 and 1): the first event will have weight 1, one happening a day later will have weight 0.5, the next day 0.25, etc. If the following event happens 2 days after that, this would mean a weight of 0.25 * 0.5 * 0.5 = 0.0625.
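Under the interpretation above (each elapsed day multiplies the weight by 1 - daily_decay), the weights can be sketched as follows. This is an illustration of the rule as described, not the BaseModel implementation.

```python
import numpy as np

# Weight of an event `days_elapsed` days after the split point, assuming
# each elapsed day multiplies the weight by (1 - daily_decay).
def time_decay_weights(days_elapsed, daily_decay: float) -> np.ndarray:
    return (1.0 - daily_decay) ** np.asarray(days_elapsed, dtype=np.float64)
```

With daily_decay=0.5, an event 4 days after the split point gets weight 0.5**4 = 0.0625, consistent with the worked example above.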

📘

Please note

In our case, we are predicting the next basket. A basket is defined as the products that share the same timestamp at the time of purchase.

Possible Operations in the Target Function

There are a few operations that can be applied within the target function to both history and future event objects. The following operations are possible:

  • count(): returns the number of events.
    Arguments: none.
    Returns: number of events (integer).
  • apply(): applies a function to a target column.
    Arguments: func (Callable[[Any], Any]): function to apply; target (str): target column name.
    Returns: DataSourceEvents: events with the target column transformed.
  • filter(): filters events based on a condition.
    Arguments: by (str): column to check the condition against; condition (Callable[[Any], bool]): filtering condition.
    Returns: DataSourceEvents: filtered events.
  • groupBy(): groups the events by values in a column.
    Arguments: by (Union[str, List[str]]): columns to group by.
    Returns: requires one of the operators listed below to return anything.

On the result of groupBy() we can additionally perform the following operations:

  • count(): counts elements in each group (within groupBy()).
    Arguments: normalize (Optional[bool], optional): normalize counts (default: False); groups (Optional[List[Any]], optional): limit grouping (default: None).
    Returns: Tuple[np.ndarray, List[str]]: counts of elements per group and group names.
  • sum(): sums values of a column in each group.
    Arguments: target (str): column for grouping; groups (Optional[List[Any]], optional): limit grouping (default: None).
    Returns: Tuple[np.ndarray, List[str]]: sum of elements per group and group names.
  • mean(): computes the mean of column values in each group.
    Arguments: target (str): column for grouping; groups (Optional[List[Any]], optional): limit grouping (default: None).
    Returns: Tuple[np.ndarray, List[str]]: mean of elements per group and group names.
  • min(): finds the minimum of column values in each group.
    Arguments: target (str): column for grouping; groups (Optional[List[Any]], optional): limit grouping (default: None).
    Returns: Tuple[np.ndarray, List[str]]: minimum of elements per group and group names.
  • max(): finds the maximum of column values in each group.
    Arguments: target (str): column for grouping; groups (Optional[List[Any]], optional): limit grouping (default: None).
    Returns: Tuple[np.ndarray, List[str]]: maximum of elements per group and group names.
  • exists(): checks if groups are not empty.
    Arguments: groups (List[Any]): groups to check.
    Returns: Tuple[np.ndarray, List[str]]: array indicating existence of elements per group and group names.
  • apply(): applies a function to each group.
    Arguments: func (Callable[[np.ndarray], Any]): function to apply; default_value (Any): default output; target (str): column for grouping; groups (Optional[List[Any]], optional): limit grouping (default: None).
    Returns: Tuple[Any, List[str]]: values returned by func per group and group names.
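To illustrate the semantics of groupBy().exists() (the pattern used in the propensity example), here is a plain-Python stand-in. It mimics the described return shape using a list of dicts instead of Events objects; it is not the BaseModel implementation.

```python
import numpy as np

# Stand-in for Events.groupBy(by).exists(groups): returns a 0/1 array
# indicating, per group, whether any event falls into that group,
# together with the list of group names.
def group_exists(events, by, groups):
    present = {e[by] for e in events}
    flags = np.array([1.0 if g in present else 0.0 for g in groups])
    return flags, list(groups)

events = [{"category": "Dairy"}, {"category": "Bakery"}]
flags, names = group_exists(events, "category", ["Fruits", "Dairy", "Bakery"])
```

Passing groups fixes both the order and the length of the output array, which is what makes the result usable as a multi-label target.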

Validating Target function

For complex modeling needs, writing a target function can get complicated. To quickly validate the function and its output, and to ensure it models exactly what the author intended, we provide a validation function that can be run before executing the entire downstream task.

The verify_target function takes the following arguments:

  • target_fn (TargetFunction): target function to evaluate.
  • fm_checkpoint_path (str | Path): path to the FM checkpoint.
  • task (Task): task for which the target function will be applied.
  • data_params_overrides (DataParams): overrides for data parameters.
  • num_percentage_entities (int): percentage of all entities to validate target_fn against. Defaults to 1.
  • percentage_nones_allowed (int): allowed percentage of invalid (None) targets.

The validation function will raise one of the following error types:

  • TypeError: if the return types are incorrect or inconsistent
  • ValueError: if the percentage of targets returning None exceeds percentage_nones_allowed
  • RuntimeError: if running the target function fails

If the validation succeeds, it returns an example output of the target function.
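As a hedged sketch of what such validation does (not the verify_target implementation), the function below runs a target function over a sample of inputs, checks the return type for consistency, enforces the allowed percentage of None targets, and returns an example output on success.

```python
import numpy as np

# Illustration of the validation logic: run `target_fn` on sample inputs,
# check output types, and enforce the allowed share of None targets.
def validate_target_fn(target_fn, samples, percentage_nones_allowed=0):
    outputs, nones = [], 0
    for history, future, entity, ctx in samples:
        out = target_fn(history, future, entity, ctx)
        if out is None:
            nones += 1
            continue
        if not isinstance(out, np.ndarray):
            raise TypeError(f"expected np.ndarray, got {type(out).__name__}")
        outputs.append(out)
    if 100 * nones / len(samples) > percentage_nones_allowed:
        raise ValueError("too many None targets")
    return outputs[0] if outputs else None  # example output on success
```

Running it on a small sample before training catches type mismatches and overly aggressive None-based exclusions early.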