Product Returns Within 30 Days
Task type: BinaryClassificationTask
Industry: E-commerce / Retail
Product returns are one of the largest cost drivers in e-commerce — they involve reverse logistics, restocking, and often a lost customer. By predicting which deliveries are likely to result in a return within 30 days, retailers can trigger proactive interventions: post-delivery satisfaction check-ins, personalized usage guides, or preemptive discount offers to keep the item.
What makes this advanced? Cross-event join: the target function matches delivery events with return events using order_id, then computes the time delta between the delivery and return timestamps to determine whether the return falls within the 30-day policy window.
Prerequisites
Before writing a target function you need:
- A trained foundation model built on event data that includes the relevant data sources.
- The monad library installed in your environment.
- Data source(s): deliveries_events, returns_events
Target Function
The target function tells monad how to label each entity for training. It receives four arguments:
| Argument | Type | Description |
|---|---|---|
| history | Events | All events before the temporal split. |
| future | Events | All events after the temporal split. |
| attributes | Attributes | Static entity attributes. |
| ctx | Dict | Context dictionary containing SPLIT_TIMESTAMP, data mode, etc. |
The function must return one of:
- np.array([1], dtype=np.float32): positive case
- np.array([0], dtype=np.float32): negative case
- None: exclude this entity from training
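As a minimal sketch of this return contract, the helper below maps a three-state outcome onto the three accepted return values. `make_label` is a hypothetical name used only for illustration; it is not part of monad.

```python
from typing import Optional

import numpy as np


def make_label(returned_in_window: Optional[bool]) -> Optional[np.ndarray]:
    """Map a tri-state outcome onto the three accepted return values.

    Hypothetical helper for illustration; not part of monad.
    """
    if returned_in_window is None:
        # Outcome unknown: exclude the entity from training.
        return None
    # One-element float32 array, matching the shape used in the full example.
    return np.array([1 if returned_in_window else 0], dtype=np.float32)
```

Keeping the label construction in one place makes it harder to return a bare int or a float64 array by accident.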
Full Example
```python
import numpy as np
from datetime import timedelta
from typing import Dict

from monad.ui.target_function import Events, Attributes
from monad.ui.target_function import SPLIT_TIMESTAMP
from monad.ui.target_function import has_incomplete_training_window
from monad.constants import SECONDS_PER_DAY

# === Configuration ===
TARGET_WINDOW_DAYS = 30
DELIVERIES_DATA_SOURCE = "deliveries_events"
RETURNS_DATA_SOURCE = "returns_events"


def product_return_target_fn(
    history: Events,
    future: Events,
    attributes: Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict if customer returns a product within 30 days of delivery."""
    # 1. No returns at all: label as no-return
    if not len(future[RETURNS_DATA_SOURCE]):
        return np.array([0], dtype=np.float32)

    # 2. Iterate over deliveries and match with returns by order_id
    for delivery_ts, order_id in zip(
        future[DELIVERIES_DATA_SOURCE]["order_id"].timestamps,
        future[DELIVERIES_DATA_SOURCE]["order_id"],
    ):
        order_return = future[RETURNS_DATA_SOURCE].filter(
            "order_id", lambda oid: oid == order_id,
        )
        if not len(order_return):
            continue

        # 3. Check if return happened within 30 days
        return_ts = order_return.timestamps[0]
        if 0 <= (return_ts - delivery_ts) < TARGET_WINDOW_DAYS * SECONDS_PER_DAY:
            return np.array([1], dtype=np.float32)

    return np.array([0], dtype=np.float32)
```
Step-by-Step Breakdown
① Check for any returns
A quick early exit: if there are no return events at all in the future, the customer did not return anything. This avoids unnecessary iteration over deliveries and labels the entity as negative immediately.
② Iterate deliveries matching by order_id
```python
for delivery_ts, order_id in zip(
    future[DELIVERIES_DATA_SOURCE]["order_id"].timestamps,
    future[DELIVERIES_DATA_SOURCE]["order_id"],
):
    order_return = future[RETURNS_DATA_SOURCE].filter(
        "order_id", lambda oid: oid == order_id,
    )
    if not len(order_return):
        continue
```
The function loops through each delivery event, extracting both its timestamp and order_id. For each delivery, it filters the returns data source to find a matching return with the same order_id. This cross-event join is the core pattern — linking two different event streams by a shared identifier.
③ Compare timestamps
```python
    return_ts = order_return.timestamps[0]
    if 0 <= (return_ts - delivery_ts) < TARGET_WINDOW_DAYS * SECONDS_PER_DAY:
        return np.array([1], dtype=np.float32)
```
When a matching return is found, the function checks whether it happened within 30 days of delivery by comparing timestamps. The first matching return within the window is sufficient to label the entity as positive. If no delivery-return pair falls within the window, the entity is labeled negative.
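The matching logic can also be exercised in isolation, without monad. The sketch below is a stand-in built on assumptions: deliveries and returns are plain lists of (timestamp in seconds, order_id) tuples rather than monad Events, `label_entity` is a hypothetical name, and `SECONDS_PER_DAY` is defined locally as 86,400. Instead of filtering the returns once per delivery, it indexes the earliest return per order_id up front, which is a swapped-in design choice rather than what the target function above does.

```python
from typing import Dict, List, Tuple

SECONDS_PER_DAY = 86_400  # defined locally; assumed to match monad.constants
TARGET_WINDOW_DAYS = 30

Event = Tuple[float, str]  # (Unix timestamp in seconds, order_id)


def label_entity(deliveries: List[Event], returns: List[Event]) -> int:
    """Return 1 if any delivery has a return within the window, else 0."""
    # Index the earliest return timestamp per order_id in one pass.
    first_return: Dict[str, float] = {}
    for ts, order_id in returns:
        if order_id not in first_return or ts < first_return[order_id]:
            first_return[order_id] = ts

    window = TARGET_WINDOW_DAYS * SECONDS_PER_DAY
    for delivery_ts, order_id in deliveries:
        return_ts = first_return.get(order_id)
        if return_ts is None:
            continue  # no return recorded for this order
        # Same half-open check as the target function: 0 <= delta < 30 days.
        if 0 <= return_ts - delivery_ts < window:
            return 1
    return 0


deliveries = [(0.0, "A"), (1_000.0, "B")]
returns = [(10 * SECONDS_PER_DAY, "A"), (1_000.0 + 45 * SECONDS_PER_DAY, "B")]
print(label_entity(deliveries, returns))      # 1: order A came back on day 10
print(label_entity(deliveries[1:], returns))  # 0: order B came back on day 45
```

A return exactly 30 days after delivery falls outside the window, because the upper bound is exclusive.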
Training
Once the target function is defined, fine-tune a downstream model:
```python
from pathlib import Path

from monad.ui.config import TrainingParams, MetricParams, MetricMonitoringMode
from monad.config.early_stopping import EarlyStopping
from monad.ui.module import load_from_foundation_model, BinaryClassificationTask

module = load_from_foundation_model(
    checkpoint_path=Path("./foundation_model"),
    downstream_task=BinaryClassificationTask(),
    target_fn=product_return_target_fn,
)

training_params = TrainingParams(
    checkpoint_dir=Path("./<this_model>"),
    learning_rate=1e-4,
    epochs=20,
    devices=[0],
    metrics=[
        MetricParams(alias="auroc", metric_name="AUROC", kwargs={"task": "binary"}),
        MetricParams(alias="auprc", metric_name="AveragePrecision", kwargs={"task": "binary"}),
        MetricParams(alias="recall", metric_name="Recall", kwargs={"task": "binary"}),
        MetricParams(alias="precision", metric_name="Precision", kwargs={"task": "binary"}),
    ],
    metric_to_monitor="val_auroc_0",
    metric_monitoring_mode=MetricMonitoringMode.MAX,
    early_stopping=EarlyStopping(min_delta=1e-4, patience=5),
)

module.fit(training_params, seed=42)
```
Evaluation
```python
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, MetricParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    prediction_date=datetime(2024, 5, 1, tzinfo=timezone.utc),
    output_type=OutputType.DECODED,
    devices=[0],
    metrics=[
        MetricParams(alias="auroc", metric_name="AUROC"),
        MetricParams(alias="auprc", metric_name="AveragePrecision"),
        MetricParams(alias="recall", metric_name="Recall"),
    ],
)

results = module.test(testing_params)
```
Prediction
```python
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    local_save_location=Path("./predictions.tsv"),
    output_type=OutputType.DECODED,
    prediction_date=datetime(2024, 6, 1, tzinfo=timezone.utc),
    devices=[0],
)

predictions = module.predict(testing_params)
```
Recommended Metrics
| Metric | Why it matters |
|---|---|
| AUROC | Measures overall ranking quality. |
| AUPRC | More informative when the positive class is rare. |
| Recall | Proportion of actual positives caught. |
| Precision | Proportion of predicted positives that are correct. |
| F1 Score | Harmonic mean of precision and recall. |
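To make the relationship between the last three metrics concrete, here is a small worked example in plain Python. The confusion counts are made up for illustration, and `precision_recall_f1` is a hypothetical helper, not a monad function.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts: 60 returns correctly flagged, 40 false alarms, 20 missed.
precision, recall, f1 = precision_recall_f1(tp=60, fp=40, fn=20)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.6 0.75 0.667
```

With these counts, the model catches 75% of actual returns, while 40% of the interventions it triggers go to customers who would have kept the item anyway.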
Production Tips
- Align the window with your return policy. If your return policy is 14 or 60 days rather than 30, adjust TARGET_WINDOW_DAYS accordingly. The model should predict within the window where interventions can still prevent the return.
- Consider partial returns. Some orders contain multiple items and customers may return only one. Depending on your business logic, you may want to label at the item level rather than the order level.
- Watch for timestamp accuracy in delivery events. Delivery timestamps may reflect carrier scan times, not actual receipt by the customer. A 1-2 day buffer in the window can account for this discrepancy.
- Enrich with product category features. Return rates vary dramatically by category (apparel vs electronics). Ensure your foundation model includes product attributes so the downstream model can learn category-specific patterns.
- Monitor for seasonal shifts. Holiday periods see both higher purchase volumes and higher return rates. Evaluate model performance separately for peak and off-peak periods to ensure consistent accuracy.
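The tips on the policy window and on delivery timestamp accuracy can be combined in one parameterized check. This is a sketch under assumptions: `in_return_window`, `policy_days`, and `buffer_days` are hypothetical names, timestamps are Unix seconds, and the buffer extends only the end of the window to absorb the lag between a carrier scan and actual receipt. Whether widening the window this way suits your data is a judgment call.

```python
SECONDS_PER_DAY = 86_400  # defined locally for this sketch


def in_return_window(
    delivery_ts: float,
    return_ts: float,
    policy_days: int = 30,
    buffer_days: int = 0,
) -> bool:
    """Check whether a return falls inside the policy window plus a buffer."""
    delta = return_ts - delivery_ts
    # Half-open interval: returns before delivery or at the limit are excluded.
    return 0 <= delta < (policy_days + buffer_days) * SECONDS_PER_DAY


day = SECONDS_PER_DAY
print(in_return_window(0, 31 * day))                  # False: past 30 days
print(in_return_window(0, 31 * day, buffer_days=2))   # True: inside 32 days
print(in_return_window(0, 20 * day, policy_days=14))  # False: 14-day policy
```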