Target Function Inputs
Events, main entity attributes and joined attributes
Check This First!
This article refers to BaseModel accessed via a Docker container. If you are using BaseModel as an SF GUI application, please refer to the Snowflake Native App section.
In this article, we will cover the inputs to target functions — the events and entity attributes that you can transform to obtain the output type and value suitable for your business scenario.
Target Function Parameters
When defining a scenario target function, we pass several standard parameters as below:
def target_fn(history: Events, future: Events, entity: Attributes, ctx: Dict) -> np.ndarray:
These parameters are defined as follows:
- history : monad.targets.entity.Events, required
  Events from all event-type data sources with timestamps before the split point. These are used for model features, e.g. the history of users' purchases.
- future : monad.targets.entity.Events, required
  Events from all event-type data sources with timestamps after the split point. These will be used as the target.
- entity : monad.targets.entity.Attributes, optional
  Object containing entity attributes from the main_entity_attribute data source defined in the pretrain phase. They can be used e.g. for filtering purposes.
- ctx : Dict, optional
  Additional information about model training is contained here. Some entries (e.g. split_timestamp) can be useful for a more refined target definition.
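To make the roles of history, future and the split point more concrete, here is a minimal sketch in plain Python. It does not use the monad library: the Event class and the split_events helper are hypothetical stand-ins, and the treatment of an event falling exactly on the split point is an assumption, since the text above only speaks of "before" and "after".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical stand-in for one row of an event-type data source."""
    timestamp: float  # e.g. Unix seconds
    price: float


def split_events(events: List[Event], split_timestamp: float) -> Tuple[List[Event], List[Event]]:
    """Partition events into (history, future) around the split point.

    Assumption: an event exactly at the split point goes to `future`;
    the article does not say how BaseModel treats that boundary.
    """
    history = [e for e in events if e.timestamp < split_timestamp]
    future = [e for e in events if e.timestamp >= split_timestamp]
    return history, future


events = [Event(10.0, 5.0), Event(20.0, 7.5), Event(30.0, 2.5)]
history, future = split_events(events, split_timestamp=25.0)

# A simple binary target: 1.0 if the entity has any future event, else 0.0.
target = 1.0 if future else 0.0
```

In a real target function the split has already been done: history and future arrive as arguments, and the function only has to turn future (optionally filtered with entity or ctx) into the target value.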
The syntax when referring to the standard data passed into the function is explained below:
- When referring to all history or target events of a particular type, e.g. to count or group them, we use history['data_source_name'] or future['data_source_name']:
  future['purchases']
- To refer to a particular column in history or future events, e.g. to sum its values or filter on its value, we use history['data_source_name']['col_name'] or future['data_source_name']['col_name']:
  history['transactions']['price']
- To access the timestamps of events, e.g. to find the first or last one, we use history['data_source_name'].timestamps or future['data_source_name'].timestamps:
  next_transaction_ts = future['transactions'].timestamps[0]
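The three access patterns above can be tried out with a small stand-in object. The sketch below is an illustration only: EventSource is a hypothetical class that mimics just the lookups described in this article, a plain dict plays the role of Events, and support for len() as well as ascending timestamp order are assumptions rather than documented behavior of monad.

```python
from typing import Dict, List


class EventSource:
    """Hypothetical stand-in for the events of one data source.

    Mimics only the access patterns described above: column lookup by
    name and a `timestamps` attribute. It is not the monad implementation.
    """

    def __init__(self, timestamps: List[float], columns: Dict[str, List[float]]):
        self.timestamps = timestamps  # assumed to be sorted in ascending order
        self._columns = columns

    def __getitem__(self, col_name: str) -> List[float]:
        return self._columns[col_name]

    def __len__(self) -> int:
        # Assumption: counting events via len() is a convenience of this stand-in.
        return len(self.timestamps)


# A plain dict keyed by data source name stands in for Events here.
history = {"transactions": EventSource([10.0, 20.0], {"price": [5.0, 7.5]})}
future = {"transactions": EventSource([30.0, 45.0], {"price": [2.5, 4.0]})}

n_future = len(future["transactions"])                      # all events of one type
past_spend = sum(history["transactions"]["price"])          # one column of those events
next_transaction_ts = future["transactions"].timestamps[0]  # first future timestamp
```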
Note
All events used by BaseModel for behavioral modeling have timestamps. While the attributes may also contain dates, they should map 1:1 to main_entity_id, so that each attribute has exactly one value per entity ID.
Timestamps are taken from the date_column provided in the YAML config file for the pretrain phase.
Joined attributes
If our foundation model used data sources with joined attributes, and we now want to use them in the target function for a business scenario, we need the get_qualified_column_name helper function, which correctly accesses the joined attributes.
get_qualified_column_name('col_name', [list_of_datasources])
Arguments
- column_name: name of the attribute column we intend to use in the function
- data_sources_path: list representing the join hierarchy. It should omit the main data source to which other data sources were joined.
Example
There are three data sources, 'transactions', 'products' and 'categories'. The join hierarchy is as follows:
- the 'categories' table is joined to 'products',
- the 'products' table is joined to 'transactions'.
If we want to use the 'category_name' column from the 'categories' data source, which was joined in the YAML config file, we need to use get_qualified_column_name to qualify the column name as below:
get_qualified_column_name('category_name', ['products', 'categories'])
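For deeper join hierarchies it can help to derive the data source list mechanically. The sketch below shows one possible way to do that, assuming the hierarchy is written down as a mapping from each joined data source to the one it was joined to. build_data_sources_path is a hypothetical helper for illustration, not part of BaseModel, and the sketch deliberately does not call get_qualified_column_name itself.

```python
from typing import Dict, List


def build_data_sources_path(joined_to: Dict[str, str], target: str, main: str) -> List[str]:
    """Walk from `target` up to `main` and return the join path, main omitted.

    `joined_to` maps each joined data source to the one it was joined to.
    Hypothetical helper for illustration; not part of BaseModel.
    """
    path = [target]
    while joined_to[path[-1]] != main:
        path.append(joined_to[path[-1]])
    path.reverse()  # order from the source nearest the main one down to the target
    return path


# Join hierarchy from the example: categories -> products -> transactions (main).
joined_to = {"categories": "products", "products": "transactions"}
path = build_data_sources_path(joined_to, target="categories", main="transactions")
# `path` is the list that would be passed as the second argument
# of get_qualified_column_name for the 'category_name' column.
```

The resulting list starts with the data source joined directly to the main one and ends with the data source that holds the column, matching the order used in the example.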
For an end-to-end example of a target function with a joined attribute, please refer to this article.