Positive Reviews for All Items
Task type: BinaryClassificationTask
Industry: E-commerce
Customers who rate every item in an order highly are prime candidates for loyalty programs, referral incentives, and user-generated content campaigns. Identifying these highly satisfied customers before they even place their next order lets marketing teams prepare personalized post-purchase flows — review request emails, social sharing prompts, or ambassador program invitations — timed to arrive when satisfaction is at its peak.
What makes this advanced? Cross-event join via extra columns: the target function uses .extra["order_id"] to link transactions with reviews, a pattern that requires configuring extra columns in the foundation model so that join keys are available at prediction time.
Prerequisites
Before writing a target function you need:
- A trained foundation model built on event data that includes the relevant data sources. The foundation model must be configured with order_id as an extra column on the transactions data source in your FM config YAML.
- The monad library installed in your environment.
- Data source(s): transactions, reviews
Target Function
The target function tells monad how to label each entity for training. It receives four arguments:
| Argument | Type | Description |
|---|---|---|
| history | Events | All events before the temporal split. |
| future | Events | All events after the temporal split. |
| attributes | Attributes | Static entity attributes. |
| ctx | Dict | Context dictionary containing SPLIT_TIMESTAMP, data mode, etc. |
The function must return one of:
- np.array([1], dtype=np.float32): positive case
- np.array([0], dtype=np.float32): negative case
- None: exclude this entity from training
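The three return values can be wrapped in a small helper so the encoding stays consistent across target functions. This is a minimal sketch: encode_label is a hypothetical name introduced here for illustration, not part of monad.

```python
from typing import Optional

import numpy as np


def encode_label(is_positive: Optional[bool]) -> Optional[np.ndarray]:
    """Map a tri-state outcome to the return values listed above.

    encode_label is a hypothetical helper, not a monad function.
    """
    if is_positive is None:
        # None tells the training pipeline to exclude this entity.
        return None
    # One-element float32 array: 1 for positive, 0 for negative.
    return np.array([1 if is_positive else 0], dtype=np.float32)
```

Passing None through unchanged keeps the "exclude" decision explicit at the call site instead of hiding it behind a default label.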
Full Example
import numpy as np
from datetime import timedelta
from typing import Dict

from monad.ui.target_function import Events, Attributes
from monad.ui.target_function import SPLIT_TIMESTAMP
from monad.ui.target_function import has_incomplete_training_window

# === Configuration ===
TRANSACTIONS_DATA_SOURCE = "transactions"
REVIEWS_DATA_SOURCE = "reviews"
POSITIVE_RATING_THRESHOLD = 8


def positive_reviews_target_fn(
    history: Events,
    future: Events,
    attributes: Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict if customer leaves positive reviews for all items in next order."""
    # 1. Get the next order ID from extra columns
    future_orders = future[TRANSACTIONS_DATA_SOURCE].extra["order_id"]
    if len(future_orders) == 0:
        return np.array([0], dtype=np.float32)
    next_basket_id = future_orders[0]

    # 2. Get ratings for items in that order
    next_basket_ratings = future[REVIEWS_DATA_SOURCE].filter(
        "order_id", lambda x: x == next_basket_id
    )["rating"].events

    # 3. Handle missing ratings
    if next_basket_ratings.dtype.kind == "f" and np.isnan(next_basket_ratings).any():
        return np.array([0], dtype=np.float32)

    # 4. Check if all ratings are positive
    all_positive = np.all(next_basket_ratings > POSITIVE_RATING_THRESHOLD)
    return np.array([int(all_positive)], dtype=np.float32)
Extra columns
This recipe uses .extra["order_id"] to access the order identifier from the
transactions data source. Extra columns must be configured when building the
foundation model — they are columns that are not used as features but are
carried through for use in target functions and post-processing.
Step-by-Step Breakdown
① Get next order from extra columns
future_orders = future[TRANSACTIONS_DATA_SOURCE].extra["order_id"]
if len(future_orders) == 0:
    return np.array([0], dtype=np.float32)
next_basket_id = future_orders[0]
The .extra accessor retrieves columns that were configured as extra columns in the foundation model. Here, order_id is used as a join key to link transactions with their reviews. The first order ID in the future represents the customer's next order. If no future transactions exist, the entity is labeled negative.
② Retrieve ratings for that order
next_basket_ratings = future[REVIEWS_DATA_SOURCE].filter(
    "order_id", lambda x: x == next_basket_id
)["rating"].events
The reviews data source is filtered to only include reviews that match the next order's ID. The ["rating"] accessor extracts the numerical rating values. This cross-event join connects two different data sources (transactions and reviews) through the shared order_id.
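The join can be pictured with plain NumPy arrays. The sketch below is an illustration only: it assumes each source can be viewed as parallel arrays of order IDs and ratings, it does not use the monad Events API, and the array names and values are made up.

```python
import numpy as np

# Hypothetical future events for one customer (illustrative data only).
transaction_order_ids = np.array(["A17", "A17", "B02"])  # one row per purchased item
review_order_ids = np.array(["B02", "A17", "A17"])       # one row per review
review_ratings = np.array([6.0, 9.0, 10.0])

# The first future transaction identifies the next order.
next_order_id = transaction_order_ids[0]

# Boolean mask: keep only the reviews that belong to that order.
mask = review_order_ids == next_order_id
next_order_ratings = review_ratings[mask]
```

Only the two reviews attached to the first order survive the mask; the review for the later order is ignored even though it also lies in the future window.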
③ Handle NaN ratings
if next_basket_ratings.dtype.kind == "f" and np.isnan(next_basket_ratings).any():
    return np.array([0], dtype=np.float32)
Missing ratings (NaN values) indicate items that were not reviewed. Since we cannot confirm positive sentiment for unreviewed items, these entities are conservatively labeled negative. This prevents the model from treating incomplete data as positive evidence.
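The dtype check restricts the NaN test to floating-point arrays, the only kind that can hold NaN. The sketch below isolates that check and shows the two possible policies for an incomplete order: the conservative negative label used in this recipe, or exclusion by returning None. Both function names are hypothetical and introduced only for illustration.

```python
from typing import Optional

import numpy as np


def has_missing_ratings(ratings: np.ndarray) -> bool:
    """True when a floating-point array contains at least one NaN.

    Integer arrays cannot hold NaN, so they are reported as complete.
    """
    return ratings.dtype.kind == "f" and bool(np.isnan(ratings).any())


def incomplete_order_label(exclude_unreviewed: bool) -> Optional[np.ndarray]:
    """Label for an order with missing ratings under the chosen policy."""
    if exclude_unreviewed:
        return None  # drop the entity from training
    return np.array([0], dtype=np.float32)  # conservative negative label
```

Which policy fits depends on the review rate: when few customers review at all, the negative label mostly encodes "did not review" rather than "was not satisfied".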
④ Check all-positive condition
all_positive = np.all(next_basket_ratings > POSITIVE_RATING_THRESHOLD)
return np.array([int(all_positive)], dtype=np.float32)
The label is positive only if every single item in the order received a rating above the threshold (8). np.all enforces this strict criterion — even one mediocre rating means the entity is labeled negative. This identifies truly delighted customers, not just satisfied ones.
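Two edge cases of this check are worth verifying with NumPy alone. The comparison is strict, so a rating equal to the threshold does not count as positive. In addition, np.all returns True for an empty array, so an order with no matching review rows would pass the check unless it is guarded. The guard below is a suggested addition, not part of the recipe's code, and all_ratings_positive is a hypothetical helper.

```python
import numpy as np

POSITIVE_RATING_THRESHOLD = 8


def all_ratings_positive(
    ratings: np.ndarray, threshold: float = POSITIVE_RATING_THRESHOLD
) -> bool:
    """True only when there is at least one rating and all exceed the threshold."""
    if ratings.size == 0:
        # np.all on an empty array is True, so treat "no reviews" explicitly.
        return False
    return bool(np.all(ratings > threshold))


# Behaviour of the unguarded check on an empty array.
unguarded_empty = bool(np.all(np.array([]) > POSITIVE_RATING_THRESHOLD))
```

Whether an order without any review rows should be negative or excluded is a labeling decision; the point of the guard is that it should not silently become positive.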
Training
Once the target function is defined, fine-tune a downstream model:
from pathlib import Path

from monad.ui.config import TrainingParams, MetricParams, MetricMonitoringMode
from monad.config.early_stopping import EarlyStopping
from monad.ui.module import load_from_foundation_model, BinaryClassificationTask

module = load_from_foundation_model(
    checkpoint_path=Path("./foundation_model"),
    downstream_task=BinaryClassificationTask(),
    target_fn=positive_reviews_target_fn,
)

training_params = TrainingParams(
    checkpoint_dir=Path("./<this_model>"),
    learning_rate=1e-4,
    epochs=20,
    devices=[0],
    metrics=[
        MetricParams(alias="auroc", metric_name="AUROC", kwargs={"task": "binary"}),
        MetricParams(alias="auprc", metric_name="AveragePrecision", kwargs={"task": "binary"}),
        MetricParams(alias="recall", metric_name="Recall", kwargs={"task": "binary"}),
        MetricParams(alias="precision", metric_name="Precision", kwargs={"task": "binary"}),
    ],
    metric_to_monitor="val_auroc_0",
    metric_monitoring_mode=MetricMonitoringMode.MAX,
    early_stopping=EarlyStopping(min_delta=1e-4, patience=5),
)

module.fit(training_params, seed=42)
Evaluation
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, MetricParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    prediction_date=datetime(2024, 5, 1, tzinfo=timezone.utc),
    output_type=OutputType.DECODED,
    devices=[0],
    metrics=[
        MetricParams(alias="auroc", metric_name="AUROC"),
        MetricParams(alias="auprc", metric_name="AveragePrecision"),
        MetricParams(alias="recall", metric_name="Recall"),
    ],
)

results = module.test(testing_params)
Prediction
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    local_save_location=Path("./predictions.tsv"),
    output_type=OutputType.DECODED,
    prediction_date=datetime(2024, 6, 1, tzinfo=timezone.utc),
    devices=[0],
)

predictions = module.predict(testing_params)
Recommended Metrics
| Metric | Why it matters |
|---|---|
| AUROC | Measures overall ranking quality. |
| AUPRC | More informative when the positive class is rare. |
| Recall | Proportion of actual positives caught. |
| Precision | Proportion of predicted positives that are correct. |
| F1 Score | Harmonic mean of precision and recall. |
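For reference, precision, recall, and F1 can be computed directly from binary labels and thresholded predictions. The sketch below uses NumPy only and made-up arrays; it is independent of how monad computes its metrics.

```python
import numpy as np

# Illustrative ground-truth labels and thresholded predictions.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # predicted positive, actually positive
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # predicted positive, actually negative
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # predicted negative, actually positive

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
```

AUROC and AUPRC are computed from the predicted scores rather than from thresholded labels, so they do not depend on the cutoff chosen here.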
Production Tips
- Configure extra columns during foundation model training. The order_id must be declared as an extra column on the transactions data source when building the foundation model. Without this configuration, .extra["order_id"] will not be available at prediction time.
- Choose the rating threshold carefully. A threshold of 8 (out of 10) is strict. Analyze your rating distribution: if most customers rate 7-8, a threshold of 8 captures only the top tier. Adjust based on whether you want to identify "satisfied" (threshold 6-7) or "delighted" (threshold 8-9) customers.
- Handle orders with missing reviews gracefully. Not all customers review every item. The current function labels these as negative, but you may want to exclude them (return None) if the review rate is very low, to avoid biasing the model toward customers who review frequently.
- Consider review timing. Reviews may arrive days or weeks after delivery. Ensure your future window is long enough to capture reviews for the order, or the model will undercount positive cases.
- Use predictions to time review requests. Score customers at the point of order placement, then send review request emails to high-probability customers first, since they are most likely to leave positive reviews that boost your product ratings.
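One way to ground the threshold choice is to measure, on historical data, what share of reviewed orders would be labeled positive at each candidate threshold. The sketch below assumes ratings are already grouped by order in a plain dict; the data is made up and positive_share is a hypothetical helper, not a monad function.

```python
import numpy as np

# Hypothetical historical ratings grouped by order (illustrative data only).
ratings_by_order = {
    "A17": np.array([9.0, 10.0]),
    "B02": np.array([8.0, 9.0]),
    "C33": np.array([7.0, 7.0, 10.0]),
    "D41": np.array([10.0]),
}


def positive_share(orders: dict, threshold: float) -> float:
    """Fraction of reviewed orders whose ratings are all strictly above the threshold."""
    # Orders without any ratings are skipped rather than counted as positive.
    flags = [bool(np.all(r > threshold)) for r in orders.values() if r.size > 0]
    return sum(flags) / len(flags) if flags else 0.0


# Positive rate at each candidate threshold.
shares = {t: positive_share(ratings_by_order, t) for t in (6, 7, 8, 9)}
```

A steep drop between adjacent thresholds suggests the label is sensitive to that choice, and a very low positive rate is a sign that AUPRC deserves more weight than AUROC during evaluation.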