Advanced recommendation predictions
Check This First!
This article refers to BaseModel accessed via a Docker container. If you are using BaseModel as a Snowflake GUI application, please refer to the Snowflake Native App section.
BaseModel provides powerful recommendation capabilities, and this section delves into how to retrieve detailed scoring information for target entities. While default predictions offer a ranked list, this guide explains how to access the underlying model scores, enabling more fine-grained analysis and customization of recommendations. This is particularly useful for users who need to understand the confidence or strength of each recommendation, or who wish to implement custom ranking or filtering logic.
The items that the recommendation model is built to recommend, such as product IDs or categories, are called target entities. For example, if the model is designed to recommend articles to users, the target entities are the articles. Default predictions are sorted from the most to the least recommended value of the target entity, but they do not include scores for individual values. In this section, we show how to get recommendation scores for all target entity values.
The process of generating recommendation scores consists of three steps:
- Computation of encoded predictions with the OutputType.ENCODED parameter.
- Conversion of the encoded predictions into a vector of entity scores with the readout_sketch function, and generation of a dictionary mapping target entity values to score indexes with the read_target_entity_ids function.
- Transformation of the obtained data into the desired format.
In the first step, we generate predictions with OutputType.ENCODED to obtain model scores.
from datetime import datetime

from monad.ui.config import OutputType, TestingParams
from monad.ui.module import load_from_checkpoint

# declare variables
checkpoint_path = "<path/to/downstream/model/checkpoints>"  # location of recommendation model checkpoints
save_path = "<path/to/predictions/my_predictions.tsv>"  # location to store predictions
test_start_date = datetime(2023, 8, 1)  # first day of prediction period

# load scenario model to instantiate testing module
testing_module = load_from_checkpoint(
    checkpoint_path=checkpoint_path,
    test_start_date=test_start_date,
)

# define testing parameters with encoded output
testing_params = TestingParams(
    local_save_location=save_path,
    output_type=OutputType.ENCODED,
)

# run inference
testing_module.predict(testing_params=testing_params)
The file with the predictions contains two columns: the first with the main entity ID and the second with comma-separated values. These values are an encoded representation of the model's predictions, called a "sketch". The sketch is not directly interpretable as scores for specific items.
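To make the file layout concrete, here is a minimal sketch of reading one row by hand. It assumes the two columns are tab-separated (suggested by the .tsv extension, but not confirmed here) and that the file has no header row. The helper parse_encoded_row and the sample row are hypothetical and not part of BaseModel.

```python
def parse_encoded_row(line: str) -> tuple[str, list[float]]:
    """Split one row into the main entity ID and its raw sketch values."""
    # Assumption: columns are separated by a single tab character
    entity_id, raw_values = line.rstrip("\n").split("\t", maxsplit=1)
    sketch = [float(value) for value in raw_values.split(",")]
    return entity_id, sketch


# Made-up row for illustration only; a real sketch is much longer
example_row = "customer_001\t0.12,-0.53,1.80,0.07\n"
entity_id, sketch = parse_encoded_row(example_row)
print(entity_id, len(sketch))
```

Manual parsing like this is only useful for inspecting the file; the values still need to be decoded before they can be read as per-entity scores.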
In the second step below, we explain how to convert this sketch into a vector of scores that can be mapped to target entity values.
from monad.ui.module import readout_sketch, read_target_entity_ids

# declare variables
checkpoint_path = "<path/to/downstream/model/checkpoints>"  # location of recommendation model checkpoints
predictions_file = "<path/to/predictions/my_predictions.tsv>"  # location where encoded predictions are stored
target_feature_value = "<target_feature_value>"

# get a generator yielding entity id and all scored products
sketch_readout_generator = readout_sketch(
    predictions_file=predictions_file,
    checkpoint_path=checkpoint_path,
)

# get mapping of target values to score index
target_to_index = read_target_entity_ids(checkpoint_path=checkpoint_path)
We instantiate the scored-products generator by calling readout_sketch with predictions_file and checkpoint_path. The returned generator yields tuple[str, np.ndarray] pairs containing the entity ID and an array of product scores. The array has the same length as the number of target entities, and each element is the score for one target entity. The scores can be mapped to the original target feature values with the dictionary returned by the read_target_entity_ids function.
The function read_target_entity_ids takes the checkpoint_path parameter and returns a dictionary with the following structure:
- keys: target feature values,
- values: indexes of products in the array returned by readout_sketch.
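Because the dictionary maps values to indexes, listing the top recommendations for one entity requires the inverse lookup. The sketch below uses a small made-up score array and mapping in place of the real readout_sketch and read_target_entity_ids outputs. The helper top_k_targets is hypothetical, and the code assumes that a higher score means a stronger recommendation.

```python
import numpy as np


def top_k_targets(
    scores: np.ndarray, target_to_index: dict[str, int], k: int
) -> list[tuple[str, float]]:
    """Return the k highest-scoring target values together with their scores."""
    # Invert the mapping so a score index can be turned back into a target value
    index_to_target = {index: target for target, index in target_to_index.items()}
    # argsort is ascending, so take the last k indexes and reverse them
    top_indexes = np.argsort(scores)[-k:][::-1]
    return [(index_to_target[int(i)], float(scores[i])) for i in top_indexes]


# Made-up data standing in for one yielded score array and the mapping
target_to_index = {"A": 0, "B": 1, "C": 2, "D": 3}
scores = np.array([-97.5, -95.0, -99.0, -96.0])

top_two = top_k_targets(scores, target_to_index, k=2)
print(top_two)  # [('B', -95.0), ('D', -96.0)]
```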
In the last step, you can use the sketch_readout_generator and target_to_index objects to create the data structure you need. For example, you can print the scores for the customers and products of interest, or you can generate a score matrix with customers as rows and products as columns.
Example
The example below demonstrates how to obtain scores for the target entity with product ID 12345 from a recommendation model that predicts product IDs.
from monad.ui.module import readout_sketch, read_target_entity_ids

# declare variables
checkpoint_path = "<path/to/downstream/model/checkpoints>"  # location of recommendation model checkpoints
predictions_file = "<path/to/predictions/my_predictions.tsv>"  # location where encoded predictions are stored
target_feature_value = "12345"  # product id for which we want to get scores

# get a generator yielding entity id and all scored products
scored_products_generator = readout_sketch(
    predictions_file=predictions_file,
    checkpoint_path=checkpoint_path,
)

# get mapping of target values to score index
target_to_index = read_target_entity_ids(checkpoint_path=checkpoint_path)

# score index of the selected product (the same for every entity)
score_index = target_to_index[target_feature_value]

for entity_id, scored_products in scored_products_generator:
    # score of the selected product for this entity
    score = scored_products[score_index]
    print(f"The score for customer {entity_id} and product {target_feature_value} is {score}")
The code above produces output in the following form. The sample below comes from a run in which the selected product ID was 0826646002.
The score for customer 791f478f72635a0c and product 0826646002 is -97.1341323852539
The score for customer d36058219d6ea7bf and product 0826646002 is -96.62920379638672
The score for customer a38bf2cd26a05a2 and product 0826646002 is -97.0897216796875
The score for customer a0229eefa0463ae and product 0826646002 is -96.88237762451172
The score for customer de847c3ee2169bc and product 0826646002 is -97.20885467529297
.
.
.
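Per-product scores like these can feed custom ranking logic, for example selecting the customers most likely to respond to a given product. The sketch below sorts the five customers from the sample output above. It assumes that a higher (less negative) score means a stronger recommendation, and the cutoff of three customers is arbitrary.

```python
# (customer, score) pairs copied from the sample output above
scores_for_product = [
    ("791f478f72635a0c", -97.1341323852539),
    ("d36058219d6ea7bf", -96.62920379638672),
    ("a38bf2cd26a05a2", -97.0897216796875),
    ("a0229eefa0463ae", -96.88237762451172),
    ("de847c3ee2169bc", -97.20885467529297),
]

# Sort from the highest to the lowest score and keep the first three customers
ranked = sorted(scores_for_product, key=lambda pair: pair[1], reverse=True)
top_customers = [customer for customer, _ in ranked[:3]]
print(top_customers)
```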