Target Function: Inputs

⚠️
Check This First!
This article refers to BaseModel accessed via a Docker container. Please refer to the Snowflake Native App section if you are using BaseModel as an SF GUI application.

In this article, we will cover the inputs to target functions — the events and entity attributes that you can transform to obtain the output type and value suitable for your business scenario.

Target Function Parameters

A scenario target function takes the following standard parameters:

def target_fn(history: Events, future: Events, entity: Attributes, ctx: Dict) -> np.ndarray:

These parameters are defined as follows:

history :monad.targets.entity.Events, required
Events from all event-type data sources with timestamps before the split point. These are used as model features, e.g. the history of users' purchases.
future :monad.targets.entity.Events, required
Events from all event-type data sources with timestamps after the split point. These are used to compute the target.
entity :monad.targets.entity.Attributes, optional
Object containing entity attributes from the main_entity_attribute data source defined in the pretrain phase. They can be used, e.g., for filtering.
ctx :Dict, optional
Additional information about model training. Some entries (e.g. split_timestamp) can be useful for a more refined target definition.

The syntax when referring to the standard data passed into the function is explained below:

To refer to all history or future events of a particular type, e.g. to count or group them, we use history['data_source_name'] or future['data_source_name']:

future['purchases']
To refer to a particular column of history or future events, e.g. to sum its values or filter on them, we use history['data_source_name']['col_name'] or future['data_source_name']['col_name']:

history['transactions']['price']
To access the timestamps of events, e.g. to find the first or last one, we use history['data_source_name'].timestamps or future['data_source_name'].timestamps:

next_transaction_ts = future['transactions'].timestamps[0]
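The three access patterns above can be tried out without BaseModel by using small stand-in objects. The sketch below is only an illustration: StubEvents is a hypothetical class written for this example, not the monad.targets.entity.Events class, and it assumes that columns behave like sequences, that timestamps are sorted in ascending order, and that len() gives the number of events. The data source names and values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubEvents:
    """Hypothetical stand-in for the events of one data source (not the monad class)."""
    columns: Dict[str, List[float]]
    timestamps: List[int] = field(default_factory=list)  # assumed sorted ascending

    def __getitem__(self, col_name: str) -> List[float]:
        # Column access: events['col_name']
        return self.columns[col_name]

    def __len__(self) -> int:
        # Assumption for this sketch: one timestamp per event
        return len(self.timestamps)


# Stand-ins for the `history` and `future` arguments, keyed by data source name.
history = {"transactions": StubEvents({"price": [10.0, 25.5]}, timestamps=[100, 200])}
future = {
    "transactions": StubEvents({"price": [40.0]}, timestamps=[350]),
    "purchases": StubEvents({"quantity": [1, 2, 1]}, timestamps=[310, 320, 400]),
}

# 1. All events of one type, e.g. to count them.
n_future_purchases = len(future["purchases"])

# 2. One column of one data source, e.g. to sum its values.
total_spent = sum(history["transactions"]["price"])

# 3. Timestamps of one data source, e.g. to find the first future event.
next_transaction_ts = future["transactions"].timestamps[0]
```

In a real target function the same expressions would be applied to the history and future arguments passed in by BaseModel, whose objects may offer a richer interface than this stub.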

⚠️
Note
Every event used by BaseModel for behavioral modeling has a timestamp. Attributes may also contain dates, but they must map 1:1 to main_entity_id, so that each attribute has exactly one value per entity ID.
Timestamps are taken from the date_column provided in the YAML config file for the pretrain phase.

Joined attributes

If the foundation model used data sources with joined attributes and we want to use those attributes in a target function for a business scenario, we need the get_qualified_column_name helper function, which resolves the name of a joined attribute column correctly.

get_qualified_column_name('col_name', [list_of_datasources])

Arguments

column_name: name of the attribute column we intend to use in the function
data_sources_path: list representing the join hierarchy. It should omit the main data source to which the other data sources were joined.

Example

There are three data sources: 'transactions', 'products' and 'categories'. The join hierarchy is as follows:

'categories' table is joined to 'products',
'products' table is joined to 'transactions'.

To use the 'category_name' column from the 'categories' data source, which was joined in the YAML config file, we qualify the column name with get_qualified_column_name as below:

  get_qualified_column_name('category_name', ['products', 'categories'])
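The order of the list in the example above follows the join hierarchy from the main data source outward, with the main data source itself left out. As a minimal sketch of that rule, the hypothetical helper below (build_data_sources_path is not part of BaseModel) derives the list from a mapping of each joined table to the table it was joined to.

```python
from typing import Dict, List


def build_data_sources_path(source: str, joined_to: Dict[str, str], main: str) -> List[str]:
    """Return the join path from `main` to `source`, ordered from main outward, excluding `main`.

    Hypothetical helper for illustration only; it is not a BaseModel function.
    """
    path: List[str] = []
    current = source
    while current != main:
        if current in path:
            raise ValueError(f"cyclic join definition involving {current!r}")
        if current not in joined_to:
            raise ValueError(f"{current!r} is not joined to {main!r}")
        path.append(current)
        current = joined_to[current]  # step one level closer to the main data source
    return list(reversed(path))


# Join hierarchy from the example: categories -> products -> transactions (main).
joined_to = {"categories": "products", "products": "transactions"}
path = build_data_sources_path("categories", joined_to, main="transactions")
```

Here path evaluates to ['products', 'categories'], the same list that is passed as the second argument in the example above.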

For an end-to-end example of a target function with a joined attribute, please refer to this article.