Predict Days Until First Complaint
Task type: RegressionTask
Industry: General / Customer Service
Knowing when a complaint will arrive is as valuable as knowing whether it will arrive. By predicting the number of days until a customer's first complaint, service teams can prioritise outreach to customers whose complaints are imminent, schedule proactive check-ins, and allocate support resources ahead of anticipated complaint spikes.
What makes this advanced? It is a time-to-event prediction: the target function finds the earliest complaint timestamp in the future window and converts it to days elapsed since the split point.
Prerequisites
Before writing a target function you need:
- A trained foundation model built on event data that includes the relevant data sources.
- The monad library installed in your environment.
- Data source(s): complaint
Target Function
The target function tells monad how to label each entity for training. It receives four arguments:
| Argument | Type | Description |
|---|---|---|
| history | Events | All events before the temporal split. |
| future | Events | All events after the temporal split. |
| attributes | Attributes | Static entity attributes. |
| ctx | Dict | Context dictionary containing SPLIT_TIMESTAMP, data mode, etc. |
For regression tasks, the function must return one of:
- np.array([value], dtype=np.float32): the continuous target value (days until the first complaint).
- None: exclude this entity (e.g., no complaint in window, or incomplete data).
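When unit-testing a target function, it can help to check its output against this contract. The helper below is a minimal sketch, not part of monad: the name `is_valid_regression_label` is hypothetical, and the shape check assumes a single regression target.

```python
import numpy as np


def is_valid_regression_label(label) -> bool:
    """Return True if `label` matches the return contract described above.

    Hypothetical helper for tests; assumes one regression target, i.e. shape (1,).
    """
    if label is None:  # None means the entity is excluded
        return True
    return (
        isinstance(label, np.ndarray)
        and label.shape == (1,)
        and label.dtype == np.float32
    )


print(is_valid_regression_label(np.array([12.5], dtype=np.float32)))  # True
print(is_valid_regression_label(None))                                # True
print(is_valid_regression_label(12.5))                                # False
```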
Full Example
import numpy as np
from datetime import timedelta
from typing import Dict

from monad.ui.target_function import Events, Attributes
from monad.ui.target_function import SPLIT_TIMESTAMP
from monad.ui.target_function import has_incomplete_training_window

# === Configuration ===
TARGET_WINDOW_DAYS = 90
COMPLAINT_DATA_SOURCE = "complaint"


def days_until_complaint_target_fn(
    history: Events,
    future: Events,
    attributes: Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict days until first complaint within 90 days."""
    split_ts = ctx[SPLIT_TIMESTAMP]

    if has_incomplete_training_window(ctx, timedelta(days=TARGET_WINDOW_DAYS)):
        return None

    future_window = future.interval_from(split_ts, timedelta(days=TARGET_WINDOW_DAYS))
    complaints = future_window[COMPLAINT_DATA_SOURCE]

    if complaints.count() == 0:
        return None

    first_complaint_ts = np.min(complaints.timestamps)
    days = (first_complaint_ts - split_ts) / 86400.0
    return np.array([days], dtype=np.float32)
Step-by-Step Breakdown
① Validate the training window
Ensures 90 days of future data are available. Truncated windows would bias the model toward shorter time-to-event values.
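The bias from truncated windows can be seen with a small, library-free illustration. The data below is synthetic and the 30-day truncation is a made-up scenario; the sketch only shows why labels computed from a shortened window skew low.

```python
import numpy as np

FULL_WINDOW_DAYS = 90
TRUNCATED_WINDOW_DAYS = 30  # hypothetical: only 30 days of future data exist

# Synthetic entities whose first complaint falls on day 0, 1, ..., 89.
true_days = np.arange(FULL_WINDOW_DAYS, dtype=np.float32)

# With a full window, every complaint is observed.
mean_full = float(true_days.mean())

# With a truncated window, only complaints before day 30 are observed;
# later complainers look like "no complaint" and would be dropped.
observed = true_days[true_days < TRUNCATED_WINDOW_DAYS]
mean_truncated = float(observed.mean())

print(mean_full, mean_truncated)  # 44.5 14.5
```

The average label drops from 44.5 to 14.5 days even though the underlying behaviour is unchanged, which is why entities with an incomplete window are skipped.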
② Trim future events to the target window
future_window = future.interval_from(split_ts, timedelta(days=TARGET_WINDOW_DAYS))
complaints = future_window[COMPLAINT_DATA_SOURCE]
Restricts complaints to the 90-day observation window. Complaints beyond this horizon are ignored.
③ Exclude customers with no complaints
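Customers who do not complain within the window are excluded from training. These customers are right-censored observations: all we know is that their time to first complaint exceeds 90 days. Because they are dropped, the model learns only from customers who actually complained, so its predictions are conditional on a complaint occurring within the window. For survival-analysis style modelling, consider encoding censored observations explicitly rather than excluding them.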
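One common way to keep censored entities is to pair each duration with an event indicator, as survival-analysis methods expect. The sketch below is an assumption-laden illustration: `encode_with_censoring` is a hypothetical helper, and nothing in this guide indicates that the RegressionTask shown here consumes such pairs directly.

```python
import numpy as np
from typing import Optional, Tuple


def encode_with_censoring(
    days_to_first_complaint: Optional[float],
    window_days: float = 90.0,
) -> Tuple[np.ndarray, bool]:
    """Return (duration, event_observed) for one entity.

    `days_to_first_complaint` is None when no complaint occurred in the window.
    """
    if days_to_first_complaint is None:
        # Right-censored: the complaint time is only known to exceed the window.
        return np.array([window_days], dtype=np.float32), False
    return np.array([days_to_first_complaint], dtype=np.float32), True


print(encode_with_censoring(12.0))  # duration 12.0, event observed
print(encode_with_censoring(None))  # duration 90.0, censored
```

A downstream loss that understands the indicator can then treat the 90-day value as a lower bound instead of an exact label.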
④ Compute days to first complaint
first_complaint_ts = np.min(complaints.timestamps)
days = (first_complaint_ts - split_ts) / 86400.0
return np.array([days], dtype=np.float32)
np.min finds the earliest complaint timestamp. Dividing the difference by 86,400 converts seconds to days, which assumes both timestamps are expressed in seconds. The result is a single float and may be fractional: 0.0 means a complaint at the split timestamp, 45.0 means a complaint exactly 45 days later.
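To test the labelling logic without monad, the same steps can be mirrored on a plain NumPy array. This is a sketch under stated assumptions: timestamps are Unix seconds, the window is treated as half-open (the boundary behaviour of `interval_from` is not specified here), and `days_to_first_event` is a hypothetical name.

```python
import numpy as np
from typing import Optional

SECONDS_PER_DAY = 86400.0


def days_to_first_event(
    event_ts: np.ndarray,
    split_ts: float,
    window_days: float = 90.0,
) -> Optional[np.ndarray]:
    """Days from split_ts to the earliest event inside the window, or None."""
    window_end = split_ts + window_days * SECONDS_PER_DAY
    # Assumption: the window is half-open, [split_ts, window_end).
    in_window = event_ts[(event_ts >= split_ts) & (event_ts < window_end)]
    if in_window.size == 0:
        return None  # no event in the window: exclude the entity
    days = (in_window.min() - split_ts) / SECONDS_PER_DAY
    return np.array([days], dtype=np.float32)


split = 1_700_000_000.0  # arbitrary Unix timestamp in seconds
events = np.array([split + 45 * 86400, split + 3.5 * 86400, split + 200 * 86400])
print(days_to_first_event(events, split))        # [3.5]
print(days_to_first_event(np.array([]), split))  # None
```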
Training
Once the target function is defined, fine-tune a downstream model:
from pathlib import Path

from monad.ui.config import TrainingParams, MetricParams, MetricMonitoringMode
from monad.config.early_stopping import EarlyStopping
from monad.ui.module import load_from_foundation_model, RegressionTask

module = load_from_foundation_model(
    checkpoint_path=Path("./foundation_model"),
    downstream_task=RegressionTask(num_targets=1),
    target_fn=days_until_complaint_target_fn,
)

training_params = TrainingParams(
    checkpoint_dir=Path("./<this_model>"),
    learning_rate=1e-4,
    epochs=20,
    devices=[0],
    metrics=[
        MetricParams(alias="mae", metric_name="MeanAbsoluteError"),
        MetricParams(alias="mse", metric_name="MeanSquaredError"),
        MetricParams(alias="r2", metric_name="R2Score"),
    ],
    metric_to_monitor="val_mae_0",
    metric_monitoring_mode=MetricMonitoringMode.MIN,
    early_stopping=EarlyStopping(min_delta=1e-4, patience=5),
)

module.fit(training_params, seed=42)
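The early-stopping settings above are easier to tune with a mental model of how `min_delta` and `patience` interact. The sketch below assumes the convention used by most training frameworks (stop after `patience` consecutive epochs without an improvement larger than `min_delta`); whether monad's EarlyStopping behaves exactly this way is not confirmed here, and `early_stop_epoch` is a hypothetical helper.

```python
from typing import List, Optional


def early_stop_epoch(
    val_losses: List[float], min_delta: float = 1e-4, patience: int = 5
) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # MIN mode: lower is better
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Validation MAE plateaus at 8.5 from epoch 2 onward.
history = [10.0, 9.0, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.0]
print(early_stop_epoch(history))  # 7
```

Under this convention, the improvement at epoch 8 is never reached because five stagnant epochs have already elapsed; a larger `patience` trades extra compute for the chance to escape such plateaus.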
Evaluation
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, MetricParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    prediction_date=datetime(2024, 5, 1, tzinfo=timezone.utc),
    output_type=OutputType.DECODED,
    devices=[0],
    metrics=[
        MetricParams(alias="mae", metric_name="MeanAbsoluteError"),
        MetricParams(alias="mse", metric_name="MeanSquaredError"),
        MetricParams(alias="r2", metric_name="R2Score"),
    ],
)

results = module.test(testing_params)
Prediction
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    local_save_location=Path("./predictions.tsv"),
    output_type=OutputType.DECODED,
    prediction_date=datetime(2024, 6, 1, tzinfo=timezone.utc),
    devices=[0],
)

predictions = module.predict(testing_params)
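Predicted days become actionable once they are mapped to calendar dates and sorted. The layout of the predictions file is not described here, so the sketch below works on an in-memory mapping of entity id to predicted days; the ids, values, and the helper `rank_by_expected_complaint` are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple

PREDICTION_DATE = datetime(2024, 6, 1, tzinfo=timezone.utc)


def rank_by_expected_complaint(
    predicted_days: Dict[str, float],
) -> List[Tuple[str, datetime]]:
    """Map predicted days to calendar dates, soonest first."""
    dated = [
        # A regression head can emit small negative values; clamp them to zero.
        (entity_id, PREDICTION_DATE + timedelta(days=max(days, 0.0)))
        for entity_id, days in predicted_days.items()
    ]
    return sorted(dated, key=lambda pair: pair[1])


# Hypothetical entity ids and predictions, for illustration only.
ranking = rank_by_expected_complaint({"cust_a": 40.0, "cust_b": 2.5, "cust_c": 12.0})
for entity_id, expected in ranking:
    print(entity_id, expected.date())
```

Customers at the top of the ranking are the ones to contact first.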
Recommended Metrics
| Metric | Why it matters |
|---|---|
| MAE | Average absolute error in days; intuitive and less sensitive to outliers than RMSE. |
| RMSE | Penalises large errors more heavily than MAE. |
| R² | Proportion of variance explained by the model. |
| MAPE | Percentage-based error, useful for comparing across scales; unstable when actual values are close to zero (e.g. same-day complaints). |
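For reference, the four metrics in the table can be computed with plain NumPy. This is a minimal sketch using the standard textbook definitions, independent of how monad computes its own metrics; the sample values are made up, and MAPE skips zero targets, which is one of several possible conventions.

```python
import numpy as np


def regression_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """MAE, RMSE, R^2 and MAPE (in percent) for 1-D arrays of equal length."""
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot  # undefined if all targets are identical
    nonzero = y_true != 0       # MAPE is undefined for zero targets
    mape = float(np.mean(np.abs(err[nonzero] / y_true[nonzero])) * 100.0)
    return {"mae": mae, "rmse": rmse, "r2": r2, "mape": mape}


# Made-up days-until-complaint values for illustration.
y_true = np.array([10.0, 20.0, 40.0, 50.0])
y_pred = np.array([12.0, 18.0, 44.0, 46.0])
metrics = regression_metrics(y_true, y_pred)
print(metrics)  # mae 3.0, rmse about 3.16, r2 about 0.96, mape about 12.0
```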
Production Tips
- Consider censored observations. Excluding non-complainers introduces selection bias: the model never sees customers who did not complain, so it predicts a complaint date for everyone. For a more robust approach, consider adding a binary classification head that predicts whether a complaint will occur, and only use the regression output when the classifier predicts positive.
- Log-transform the target. Days-to-event distributions are typically right-skewed. Applying np.log1p(days) can improve model performance and prediction stability. Invert predictions with np.expm1 to get back to days.
- Segment by complaint severity. Not all complaints are equal. Train separate models for minor feedback vs. formal escalations to get more actionable predictions.
- Validate against business calendars. Complaint patterns often spike after weekends and holidays when support channels reopen. Account for business-day effects in your evaluation.
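Two of the tips above, log-transforming the target and gating the regression output on a classifier, can be sketched together. All function names below are hypothetical, and the 0.5 threshold is an arbitrary placeholder; the classifier probability is assumed to come from a separately trained model.

```python
import numpy as np
from typing import Optional


def to_log_target(days: float) -> np.ndarray:
    """Label to return from the target function when training in log space."""
    return np.array([np.log1p(days)], dtype=np.float32)


def from_log_prediction(log_days: float) -> float:
    """Invert the log1p transform to recover days."""
    return float(np.expm1(log_days))


def gated_prediction(
    p_complaint: float, log_days: float, threshold: float = 0.5
) -> Optional[float]:
    """Use the regression output only when a complaint is predicted."""
    if p_complaint < threshold:
        return None  # classifier says no complaint is expected
    return from_log_prediction(log_days)


label = to_log_target(45.0)
print(from_log_prediction(float(label[0])))      # approximately 45.0
print(gated_prediction(0.2, float(label[0])))    # None
```

If the model is trained in log space, remember that metrics reported during training are also in log space; convert predictions back to days before computing MAE for business reporting.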