
Extra Services Propensity Scoring

Task type: MultilabelClassificationTask
Industry: Car Rental / Travel

This recipe predicts which add-on services (GPS, child seat, insurance, etc.) a rental customer is likely to opt into on their next booking. The output is a binary vector with one entry per target service, enabling targeted upsell offers in the booking flow or the confirmation email.

Why multilabel? Customers often select several add-ons per rental (e.g., GPS, child seat, and insurance together). Multilabel classification predicts each service independently, so any combination of services can be represented.
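To make the argument concrete, the sketch below counts how many classes a single-label model would need to cover every possible basket of add-ons, compared with one yes/no output per service. It uses three services for readability; the list and variable names are illustrative only.

```python
from itertools import combinations

# Three add-ons for illustration, taken from the examples in this recipe.
services = ["GPS", "child_seat", "personal_insurance"]

# A single-label (multiclass) model needs one class per possible basket,
# including the empty basket.
baskets = [
    combo
    for size in range(len(services) + 1)
    for combo in combinations(services, size)
]

num_multiclass_classes = len(baskets)   # 2 ** 3 = 8
num_multilabel_outputs = len(services)  # one yes/no output per service = 3

print(num_multiclass_classes, num_multilabel_outputs)  # 8 3
```

With the ten services used in the full example, the same count is 2 ** 10 = 1024 baskets against 10 independent outputs.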


Prerequisites

Before writing a target function you need:

  • A trained foundation model built on event data that includes a contracts data source with a service column listing opted-in add-ons.
  • The monad library installed in your environment (for Python App).

Target Function

Argument     Type         Description
history      Events       All events before the temporal split.
future       Events       All events after the temporal split.
attributes   Attributes   Static entity attributes.
ctx          Dict         Context dictionary containing SPLIT_TIMESTAMP, data mode, etc.

For multilabel classification, the function must return one of:

  • A 1-D float32 array of size num_labels — binary indicators (0 or 1) per service.
  • None to exclude this customer (e.g., no future bookings, incomplete window).

Full Example

Python
import numpy as np
from datetime import timedelta
from typing import Dict

from monad.ui.target_function import Events, Attributes
from monad.ui.target_function import SPLIT_TIMESTAMP
from monad.ui.target_function import has_incomplete_training_window



# === Configuration ===
TARGET_WINDOW_DAYS = 21
CONTRACT_DATA_SOURCE = "contracts"
SERVICE_COLUMN = "service"
TARGET_SERVICES = [
    "extra_driver",
    "GPS",
    "wifi_hotspot",
    "child_seat",
    "personal_insurance",
    "damage_waiver",
    "road_assistance",
    "winter_tires_chains",
    "roof_box",
    "hotel_transfer",
]


def service_propensity_target_fn(
    history: Events,
    future: Events,
    attributes: Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Score propensity to opt in for each add-on service."""

    # 1. Ensure the training window is long enough
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if has_incomplete_training_window(ctx, target_window):
        return None

    # 2. Trim future events to the target window
    future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)

    # 3. Check which services the customer opted into
    service_labels, _ = (
        future[CONTRACT_DATA_SOURCE]
        .groupBy(SERVICE_COLUMN)
        .exists(groups=TARGET_SERVICES)
    )

    # 4. Exclude customers who selected none of the target services
    #    (this also covers customers with no future bookings at all)
    if service_labels.sum() == 0:
        return None

    return service_labels
Python (variant using the target_function namespace)
def service_propensity_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Score propensity to opt in for each add-on service."""

    # === Configuration ===
    TARGET_WINDOW_DAYS = 21
    CONTRACT_DATA_SOURCE = "contracts"
    SERVICE_COLUMN = "service"
    TARGET_SERVICES = [
        "extra_driver",
        "GPS",
        "wifi_hotspot",
        "child_seat",
        "personal_insurance",
        "damage_waiver",
        "road_assistance",
        "winter_tires_chains",
        "roof_box",
        "hotel_transfer",
    ]

    # 1. Ensure the training window is long enough
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None

    # 2. Trim future events to the target window
    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    # 3. Check which services the customer opted into
    service_labels, _ = (
        future[CONTRACT_DATA_SOURCE]
        .groupBy(SERVICE_COLUMN)
        .exists(groups=TARGET_SERVICES)
    )

    # 4. Exclude customers who selected none of the target services
    #    (this also covers customers with no future bookings at all)
    if service_labels.sum() == 0:
        return None

    return service_labels

Step-by-Step Breakdown

① Validate the training window

Python
target_window = timedelta(days=TARGET_WINDOW_DAYS)
if has_incomplete_training_window(ctx, target_window):
    return None
Python (variant using the target_function namespace)
target_window = timedelta(days=TARGET_WINDOW_DAYS)
if target_function.has_incomplete_training_window(ctx, target_window):
    return None

Skips samples whose split timestamp leaves less than the full 21-day target window of future data, because their labels would be incomplete.
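The idea behind this check can be sketched in plain Python. This is not the monad implementation of has_incomplete_training_window; the helper name window_is_incomplete and the data_end value are hypothetical and only illustrate the comparison being made.

```python
from datetime import datetime, timedelta, timezone


def window_is_incomplete(split_ts: datetime, data_end: datetime, window: timedelta) -> bool:
    """Return True when the target window would run past the last observed data.

    Hypothetical helper for illustration; not part of the monad library.
    """
    return split_ts + window > data_end


data_end = datetime(2024, 6, 30, tzinfo=timezone.utc)  # assumed end of available data
window = timedelta(days=21)

early_split = datetime(2024, 6, 1, tzinfo=timezone.utc)  # window ends 2024-06-22
late_split = datetime(2024, 6, 20, tzinfo=timezone.utc)  # window ends 2024-07-11

print(window_is_incomplete(early_split, data_end, window))  # False
print(window_is_incomplete(late_split, data_end, window))   # True
```

A sample split on 2024-06-20 would only see ten days of future events, so a customer who books on day 15 would be mislabelled as all zeros. Dropping such samples avoids that bias.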

② Trim future events

Python
future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)
Python (variant using the target_function namespace)
future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

Restricts the future events to the 21-day target window that starts at the split timestamp.
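As a rough mental model, the trimming step behaves like a timestamp filter. The sketch below uses hypothetical (timestamp, service) tuples instead of an Events object and assumes a half-open interval [split, split + window); how interval_from treats the boundaries is not stated in this recipe, so check the library documentation before relying on that detail.

```python
from datetime import datetime, timedelta, timezone

split_ts = datetime(2024, 5, 1, tzinfo=timezone.utc)
window = timedelta(days=21)

# Hypothetical (timestamp, service) pairs standing in for contract events.
events = [
    (datetime(2024, 5, 3, tzinfo=timezone.utc), "GPS"),
    (datetime(2024, 5, 21, tzinfo=timezone.utc), "child_seat"),
    (datetime(2024, 5, 22, tzinfo=timezone.utc), "roof_box"),       # exactly split + 21 days
    (datetime(2024, 6, 10, tzinfo=timezone.utc), "damage_waiver"),  # well outside the window
]

# Assumption: the window is half-open, [split, split + window).
window_end = split_ts + window
kept = [service for ts, service in events if split_ts <= ts < window_end]

print(kept)  # ['GPS', 'child_seat']
```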

③ Detect opted-in services

Python
service_labels, _ = (
    future[CONTRACT_DATA_SOURCE]
    .groupBy(SERVICE_COLUMN)
    .exists(groups=TARGET_SERVICES)
)
  • groupBy(SERVICE_COLUMN) groups contract events by the service add-on.
  • .exists(groups=TARGET_SERVICES) returns a binary array: 1 if at least one contract included that service, 0 otherwise.
  • Example output: [0, 1, 0, 1, 1, 0, 0, 0, 0, 0] means the customer opted for GPS, child seat, and personal insurance.
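The label vector is positional: entry i refers to TARGET_SERVICES[i]. The following plain-Python sketch mimics what the groupBy and exists steps compute and decodes the example output back into service names. The helpers exists_labels and decode are hypothetical, and the sketch assumes that services outside TARGET_SERVICES are simply ignored.

```python
TARGET_SERVICES = [
    "extra_driver", "GPS", "wifi_hotspot", "child_seat", "personal_insurance",
    "damage_waiver", "road_assistance", "winter_tires_chains", "roof_box",
    "hotel_transfer",
]


def exists_labels(observed_services, groups):
    """Return 1.0 for each group seen at least once, else 0.0 (illustrative helper)."""
    observed = set(observed_services)
    return [1.0 if group in observed else 0.0 for group in groups]


def decode(labels, groups):
    """Map a binary label vector back to the service names it marks."""
    return [group for group, flag in zip(groups, labels) if flag == 1]


# Service values from a customer's future contracts; duplicates and
# non-target values ("ski_rack") do not change the result.
observed = ["GPS", "child_seat", "GPS", "personal_insurance", "ski_rack"]

labels = exists_labels(observed, TARGET_SERVICES)
print(labels)                           # [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(labels, TARGET_SERVICES))  # ['GPS', 'child_seat', 'personal_insurance']
```

Because the mapping is positional, the order of TARGET_SERVICES must stay the same between training and prediction.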

④ Exclude customers with no bookings

Python
if service_labels.sum() == 0:
    return None

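Customers with no future contracts cannot opt into anything, so they add no signal. Because the check is on the label sum, it also drops customers who booked but chose none of the target services. Returning None excludes both groups from training.

Tip: If you want to include customers who booked but chose zero add-ons (as negative examples), remove this filter. Note that removing it also admits customers with no bookings in the window, whose labels are all zeros. The trade-off is more training samples but a noisier signal.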
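One way to keep zero-add-on bookers while still dropping customers who did not book is to test for the presence of contracts separately from the label sum. The sketch below shows that logic in plain Python. The input shape (one list of service names per future contract) and the function label_or_exclude are hypothetical; inside a real target function the same decision would be made on the Events object using the library's own accessors.

```python
from typing import List, Optional

# Shortened list for readability.
TARGET_SERVICES = ["GPS", "child_seat", "personal_insurance"]


def label_or_exclude(
    contracts: List[List[str]],
    keep_zero_addon_bookers: bool,
) -> Optional[List[float]]:
    """Build the label vector, or return None to exclude the customer.

    `contracts` holds one list of opted-in services per future contract
    (hypothetical shape, for illustration only).
    """
    if not contracts:
        return None  # no booking in the window: nothing to learn from

    chosen = {service for contract in contracts for service in contract}
    labels = [1.0 if service in chosen else 0.0 for service in TARGET_SERVICES]

    if sum(labels) == 0 and not keep_zero_addon_bookers:
        return None  # booked, but picked none of the target services
    return labels


print(label_or_exclude([], keep_zero_addon_bookers=True))        # None
print(label_or_exclude([[]], keep_zero_addon_bookers=True))      # [0.0, 0.0, 0.0]
print(label_or_exclude([[]], keep_zero_addon_bookers=False))     # None
print(label_or_exclude([["GPS"], ["child_seat"]], False))        # [1.0, 1.0, 0.0]
```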


Training

Python
from pathlib import Path
from monad.ui.config import TrainingParams, MetricParams, MetricMonitoringMode
from monad.config.early_stopping import EarlyStopping

from monad.ui.module import load_from_foundation_model, MultilabelClassificationTask


module = load_from_foundation_model(
    checkpoint_path=Path("./foundation_model"),
    downstream_task=MultilabelClassificationTask(class_names=TARGET_SERVICES),
    target_fn=service_propensity_target_fn,
)

training_params = TrainingParams(
    checkpoint_dir=Path("./<this_model>"),
    learning_rate=1e-4,
    epochs=20,
    devices=[0],
    metrics=[
        # num_labels must equal the length of the label vector returned by the target function
        MetricParams(alias="auroc", metric_name="AUROC", kwargs={"task": "multilabel", "num_labels": len(TARGET_SERVICES)}),
        MetricParams(alias="auprc", metric_name="AveragePrecision", kwargs={"task": "multilabel", "num_labels": len(TARGET_SERVICES)}),
        MetricParams(alias="f1", metric_name="F1Score", kwargs={"task": "multilabel", "num_labels": len(TARGET_SERVICES)}),
    ],
    metric_to_monitor="val_auroc_0",
    metric_monitoring_mode=MetricMonitoringMode.MAX,
    early_stopping=EarlyStopping(min_delta=1e-4, patience=5),
)

module.fit(training_params, seed=42)

Evaluation

Python
from pathlib import Path
from datetime import datetime, timezone
from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, MetricParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    prediction_date=datetime(2024, 5, 1, tzinfo=timezone.utc),
    output_type=OutputType.DECODED,
    devices=[0],
    metrics=[
        MetricParams(alias="auroc", metric_name="AUROC"),
        MetricParams(alias="auprc", metric_name="AveragePrecision"),
        MetricParams(alias="f1", metric_name="F1Score"),
    ],
)

results = module.test(testing_params)

Prediction

Python
from pathlib import Path
from datetime import datetime, timezone
from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    local_save_location=Path("./predictions.tsv"),
    output_type=OutputType.DECODED,
    prediction_date=datetime(2024, 6, 1, tzinfo=timezone.utc),
    devices=[0],
)

predictions = module.predict(testing_params)

Variations

Seasonal services only

Filter to services relevant to the current season (e.g., winter tires in winter):

Python
WINTER_SERVICES = ["winter_tires_chains", "road_assistance"]
SUMMER_SERVICES = ["roof_box", "wifi_hotspot"]

# Use the appropriate list based on the prediction date
TARGET_SERVICES = WINTER_SERVICES  # or SUMMER_SERVICES
Python (variant using the target_function namespace)
def winter_service_propensity_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    # === Configuration ===
    TARGET_WINDOW_DAYS = 21
    CONTRACT_DATA_SOURCE = "contracts"
    SERVICE_COLUMN = "service"
    TARGET_SERVICES = ["winter_tires_chains", "road_assistance"]

    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None
    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    service_labels, _ = (
        future[CONTRACT_DATA_SOURCE]
        .groupBy(SERVICE_COLUMN)
        .exists(groups=TARGET_SERVICES)
    )

    if service_labels.sum() == 0:
        return None

    return service_labels
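Choosing the list "based on the prediction date" can be as simple as a month lookup. The sketch below is one possible rule, not part of the recipe: the WINTER_MONTHS set (November to March, northern hemisphere) and the services_for helper are assumptions to adapt to your markets.

```python
from datetime import datetime, timezone

WINTER_SERVICES = ["winter_tires_chains", "road_assistance"]
SUMMER_SERVICES = ["roof_box", "wifi_hotspot"]

# Assumed winter months (northern hemisphere); adjust to your markets.
WINTER_MONTHS = {11, 12, 1, 2, 3}


def services_for(prediction_date: datetime) -> list:
    """Pick the seasonal service list for a prediction date (illustrative rule)."""
    if prediction_date.month in WINTER_MONTHS:
        return WINTER_SERVICES
    return SUMMER_SERVICES


print(services_for(datetime(2024, 1, 15, tzinfo=timezone.utc)))  # ['winter_tires_chains', 'road_assistance']
print(services_for(datetime(2024, 6, 1, tzinfo=timezone.utc)))   # ['roof_box', 'wifi_hotspot']
```

Since the label positions are fixed at training time, each seasonal list would need its own trained model; the lookup then decides which model to score with.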

Weight by revenue

If some add-ons generate more revenue than others, weight each predicted score by the add-on's revenue to decide which services to promote. Apply this weighting after prediction, not in the target function.
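A minimal sketch of such post-prediction weighting ranks services by expected revenue, that is, predicted probability times revenue per rental. All numbers below are made up for illustration, and treating the model score as a probability is itself an assumption that holds only if the scores are reasonably calibrated.

```python
# Hypothetical predicted probabilities for one customer (not real model output).
probabilities = {"GPS": 0.5, "child_seat": 0.25, "damage_waiver": 0.375}

# Hypothetical revenue per rental for each add-on, in any single currency.
revenue = {"GPS": 10.0, "child_seat": 24.0, "damage_waiver": 40.0}

# Expected revenue = probability of opt-in * revenue if the customer opts in.
expected_revenue = {
    service: probabilities[service] * revenue[service]
    for service in probabilities
}

# Promote the services with the highest expected revenue first.
ranked = sorted(expected_revenue, key=expected_revenue.get, reverse=True)

print(expected_revenue)  # {'GPS': 5.0, 'child_seat': 6.0, 'damage_waiver': 15.0}
print(ranked)            # ['damage_waiver', 'child_seat', 'GPS']
```

Here the child seat outranks GPS despite a lower opt-in probability, because its revenue per rental is higher.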


Metric              Why it matters
AUROC (per label)   Ranking quality for each service independently.
F1 Score (micro)    Overall correctness across all services combined.
Hamming Loss        Fraction of labels incorrectly predicted; lower is better.
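The two threshold-based metrics in the table can be computed by hand on a small example, which helps when sanity-checking reported numbers. The sketch below uses made-up labels for three customers and four services; it follows the standard definitions of Hamming loss and micro-averaged F1 and does not depend on any library.

```python
# Made-up ground truth and thresholded predictions: 3 customers x 4 services.
y_true = [
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
y_pred = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]

# Flatten to (true, predicted) pairs: micro-averaging pools all labels together.
pairs = [
    (t, p)
    for row_true, row_pred in zip(y_true, y_pred)
    for t, p in zip(row_true, row_pred)
]

# Hamming loss: share of individual labels that were predicted incorrectly.
hamming_loss = sum(t != p for t, p in pairs) / len(pairs)

# Micro F1: computed from pooled true positives, false positives, false negatives.
tp = sum(t == 1 and p == 1 for t, p in pairs)
fp = sum(t == 0 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)
micro_f1 = 2 * tp / (2 * tp + fp + fn)

print(tp, fp, fn)              # 4 1 1
print(round(hamming_loss, 4))  # 0.1667
print(micro_f1)                # 0.8
```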

Production Tips

  1. Surface predictions at booking time. Show the top 2–3 highest-scoring add-ons in the booking flow as "Recommended for you" to increase conversion.

  2. Bundle services. Combine predictions with pricing logic to offer discounted bundles of likely add-ons (e.g., "GPS + Child Seat package").

  3. Consider trip context. Different trip types (business vs. family holiday) drive different add-on needs. If trip metadata is available, include it as an attribute.

  4. Update the service list regularly. New add-ons or retired services should be reflected in TARGET_SERVICES and the model retrained.
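The first tip can be sketched as a small selection step applied to one customer's scores. The top_k_services helper, the min_score cut-off, and the example scores are all hypothetical; the actual layout of the prediction output depends on the chosen output type and is not shown here.

```python
from typing import Dict, List


def top_k_services(scores: Dict[str, float], k: int = 3, min_score: float = 0.0) -> List[str]:
    """Return up to k service names with the highest scores at or above min_score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked[:k] if score >= min_score]


# Hypothetical per-service scores for one customer.
scores = {
    "GPS": 0.62,
    "child_seat": 0.08,
    "personal_insurance": 0.41,
    "damage_waiver": 0.55,
    "roof_box": 0.03,
}

print(top_k_services(scores, k=3))                 # ['GPS', 'damage_waiver', 'personal_insurance']
print(top_k_services(scores, k=3, min_score=0.5))  # ['GPS', 'damage_waiver']
```

A minimum score keeps weak recommendations out of the booking flow when a customer has fewer than k plausible add-ons.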