Credit Card Spend Prediction
Task type: RegressionTask
Industry: Banking / Financial Services
This recipe predicts how much a customer will spend on their credit card over the next N days. It is useful for credit limit management, revenue forecasting, fraud detection baselines, and identifying high-value customers for retention programs.
What is predicted? A single continuous value — the total outgoing spend (negative amounts in the data) over the target window. Repayments (positive amounts) are excluded.
Prerequisites
Before writing a target function you need:
- A trained foundation model built on event data that includes a `card_transactions` data source with an `amount` column (where negative values represent spending).
- The monad library installed in your environment (for Python App).
Target Function
| Argument | Type | Description |
|---|---|---|
| `history` | `Events` | All events before the temporal split. |
| `future` | `Events` | All events after the temporal split. |
| `attributes` | `Attributes` | Static entity attributes. |
| `ctx` | `Dict` | Context dictionary containing `SPLIT_TIMESTAMP`, data mode, etc. |
For regression tasks, the function must return one of:
- `np.array([value], dtype=np.float32)`: the continuous target value for this customer (total spend in the window).
- `None`: exclude this customer (e.g., incomplete observation window).
Full Example
import numpy as np
from datetime import timedelta
from typing import Dict

from monad.ui.target_function import Events, Attributes
from monad.ui.target_function import SPLIT_TIMESTAMP
from monad.ui.target_function import has_incomplete_training_window

# === Configuration ===
TARGET_WINDOW_DAYS = 30
TRANSACTION_DATA_SOURCE = "card_transactions"
AMOUNT_COLUMN = "amount"


def credit_spend_target_fn(
    history: Events,
    future: Events,
    attributes: Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict total credit card spending over the target window."""
    # 1. Ensure the training window is long enough
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if has_incomplete_training_window(ctx, target_window):
        return None

    # 2. Trim future events to the target window
    future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)

    # 3. Filter to outgoing transactions only (negative amounts = spending)
    spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN,
        condition=lambda x: x < 0,
    )

    # 4. Sum total spend
    total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)

# Variant: the same function with names qualified through the target_function
# module and the configuration kept inside the function body. It assumes np,
# timedelta, Dict, and target_function are already in scope.
def credit_spend_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict total credit card spending over the target window."""
    # === Configuration ===
    TARGET_WINDOW_DAYS = 30
    TRANSACTION_DATA_SOURCE = "card_transactions"
    AMOUNT_COLUMN = "amount"

    # 1. Ensure the training window is long enough
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None

    # 2. Trim future events to the target window
    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    # 3. Filter to outgoing transactions only (negative amounts = spending)
    spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN,
        condition=lambda x: x < 0,
    )

    # 4. Sum total spend
    total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)
Step-by-Step Breakdown
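① Validate the training window
`has_incomplete_training_window(ctx, target_window)` flags samples where the split leaves less than a full 30-day observation window of future data. Returning `None` for those samples keeps truncated windows, which would understate spend, out of training.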
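The idea behind this check can be pictured with a small plain-Python sketch. It is not monad's implementation: `window_is_incomplete` and `data_end` (the timestamp of the last available event) are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timedelta, timezone


def window_is_incomplete(split_ts, data_end, window):
    """Return True when less than `window` of data follows `split_ts`."""
    return split_ts + window > data_end


# Hypothetical dataset that ends on 30 June 2024.
data_end = datetime(2024, 6, 30, tzinfo=timezone.utc)
window = timedelta(days=30)

early_split = datetime(2024, 5, 1, tzinfo=timezone.utc)   # 30 full days follow
late_split = datetime(2024, 6, 15, tzinfo=timezone.utc)   # only 15 days follow

print(window_is_incomplete(early_split, data_end, window))
print(window_is_incomplete(late_split, data_end, window))
```

A sample split on 15 June would only see half of its target window, so its summed spend would be systematically too small; dropping it is safer than training on it.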
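② Trim future events
`future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)` narrows the future events to the 30-day target window that starts at the split timestamp, so transactions after the window do not inflate the target.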
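A plain-Python sketch of this kind of trimming is shown here, assuming a half-open window `[start, start + window)`. How `interval_from` treats the end boundary is not stated in this recipe, and `trim_to_window` and the event dictionaries are hypothetical stand-ins.

```python
from datetime import datetime, timedelta, timezone


def trim_to_window(events, start, window):
    """Keep events whose timestamp falls in [start, start + window)."""
    end = start + window
    return [e for e in events if start <= e["timestamp"] < end]


split_ts = datetime(2024, 5, 1, tzinfo=timezone.utc)
events = [
    {"timestamp": split_ts + timedelta(days=1), "amount": -25.0},
    {"timestamp": split_ts + timedelta(days=29), "amount": -10.0},
    {"timestamp": split_ts + timedelta(days=30), "amount": -99.0},  # on the end boundary, dropped
    {"timestamp": split_ts + timedelta(days=45), "amount": -5.0},   # after the window, dropped
]

in_window = trim_to_window(events, split_ts, timedelta(days=30))
print([e["amount"] for e in in_window])
```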
③ Filter to spending transactions
spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
    by=AMOUNT_COLUMN,
    condition=lambda x: x < 0,
)
Credit card data typically uses negative values for purchases and positive values for repayments. This filter keeps only the spending side. Adjust the condition if your data uses a different convention.
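As an illustration of a different convention, the sketch below assumes a dataset where every amount is stored as a positive number and a separate column marks the transaction type. The `kind` column and its values are hypothetical and not part of this recipe's schema; the logic is shown in plain Python rather than through the monad API.

```python
# Toy rows: all amounts are positive, and "kind" distinguishes the direction.
transactions = [
    {"amount": 40.0, "kind": "purchase"},
    {"amount": 15.5, "kind": "purchase"},
    {"amount": 100.0, "kind": "repayment"},
]


def is_spend(row):
    """Spending is identified by transaction type, not by the sign of the amount."""
    return row["kind"] == "purchase"


total_spend = sum(row["amount"] for row in transactions if is_spend(row))
print(total_spend)
```

With such data, the recipe's filter would presumably be keyed on the type column instead of the amount column, in the same way the category-specific variation filters on `merchant_category`.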
④ Sum total spend
total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)
return np.array([total_spend], dtype=np.float32)
The .sum() method aggregates the amount column across all matching transactions. The result is a single float returned as a 1-D float32 array of size 1.
Note: The returned value is negative for any customer with spend in the window (since spending amounts are negative). If you prefer a positive target, return `abs(total_spend)` instead.
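The sign behaviour is easy to check on toy numbers. The sketch below uses plain Python lists in place of the monad event objects; the amounts are made up, and how monad's `.sum()` treats an empty selection is not stated in this recipe.

```python
# Toy amounts for one customer in the target window:
# three purchases (negative) and one repayment (positive).
amounts = [-20.0, -35.5, 100.0, -4.5]

spend = [a for a in amounts if a < 0]   # the repayment is filtered out
total_spend = sum(spend)                # negative total
positive_spend = abs(total_spend)       # same magnitude, positive sign

# In plain Python, a customer with no purchases sums to zero.
no_purchase_total = sum(a for a in [50.0] if a < 0)

print(total_spend, positive_spend, no_purchase_total)
```

Whichever sign you choose, keep it consistent between training and downstream use, since error metrics such as MAE are unaffected by the sign but thresholds and business rules are not.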
Training
from pathlib import Path

from monad.ui.config import TrainingParams, MetricParams, MetricMonitoringMode
from monad.config.early_stopping import EarlyStopping
from monad.ui.module import load_from_foundation_model, RegressionTask

module = load_from_foundation_model(
    checkpoint_path=Path("./foundation_model"),
    downstream_task=RegressionTask(num_targets=1),
    target_fn=credit_spend_target_fn,
)

training_params = TrainingParams(
    checkpoint_dir=Path("./<this_model>"),
    learning_rate=1e-4,
    epochs=20,
    devices=[0],
    metrics=[
        MetricParams(alias="mae", metric_name="MeanAbsoluteError"),
        MetricParams(alias="mse", metric_name="MeanSquaredError"),
        MetricParams(alias="r2", metric_name="R2Score"),
    ],
    metric_to_monitor="val_mae_0",
    metric_monitoring_mode=MetricMonitoringMode.MIN,
    early_stopping=EarlyStopping(min_delta=1e-4, patience=5),
)

module.fit(training_params, seed=42)
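For intuition about the `min_delta` and `patience` settings, here is a generic sketch of patience-based early stopping on a metric that should decrease. It shows the usual logic behind such settings, not monad's actual implementation; `epochs_until_stop` and the validation values are hypothetical.

```python
def epochs_until_stop(val_losses, min_delta=1e-4, patience=5):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement larger than min_delta: record it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation MAE per epoch: improves until epoch 3, then plateaus.
val_mae = [10.0, 9.0, 8.5, 8.5, 8.6, 8.5, 8.7, 8.5, 8.4]
print(epochs_until_stop(val_mae))
```

In this toy run the metric last improves at epoch 3, so five epochs without improvement are reached at epoch 8, well before the configured maximum of 20.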
Evaluation
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, MetricParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    prediction_date=datetime(2024, 5, 1, tzinfo=timezone.utc),
    output_type=OutputType.DECODED,
    devices=[0],
    metrics=[
        MetricParams(alias="mae", metric_name="MeanAbsoluteError"),
        MetricParams(alias="mse", metric_name="MeanSquaredError"),
        MetricParams(alias="r2", metric_name="R2Score"),
    ],
)

results = module.test(testing_params)
Prediction
from pathlib import Path
from datetime import datetime, timezone

from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    local_save_location=Path("./predictions.tsv"),
    output_type=OutputType.DECODED,
    prediction_date=datetime(2024, 6, 1, tzinfo=timezone.utc),
    devices=[0],
)

predictions = module.predict(testing_params)
Variations
Category-specific spend
Predict spend in a specific merchant category (e.g., dining, travel):
# Reuses the imports and configuration constants from the full example.
def dining_spend_target_fn(
    history: Events, future: Events, attributes: Attributes, ctx: Dict
) -> np.ndarray | None:
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if has_incomplete_training_window(ctx, target_window):
        return None

    future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)

    # Note: this keeps every dining transaction, so any positive amounts in
    # the category (for example refunds) are included in the sum.
    dining_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by="merchant_category",
        condition=lambda x: x == "dining",
    )

    total_spend = dining_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)

# Variant of the same function in the module-qualified style.
def dining_spend_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    TARGET_WINDOW_DAYS = 30
    TRANSACTION_DATA_SOURCE = "card_transactions"
    AMOUNT_COLUMN = "amount"

    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None

    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    # Keeps every dining transaction, including any positive amounts.
    dining_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by="merchant_category",
        condition=lambda x: x == "dining",
    )

    total_spend = dining_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)
Exclude inactive cardholders
Skip customers with no historical spend — they may be dormant:
def credit_spend_active_only_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    # === Configuration ===
    TARGET_WINDOW_DAYS = 30
    TRANSACTION_DATA_SOURCE = "card_transactions"
    AMOUNT_COLUMN = "amount"

    # Exclude customers with no historical spend
    if history[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN, condition=lambda x: x < 0
    ).count() == 0:
        return None

    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None

    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN,
        condition=lambda x: x < 0,
    )

    total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)
Recommended Metrics
| Metric | Why it matters |
|---|---|
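| MAE | Average prediction error in the same units as spend, so it is easy to interpret. |
| RMSE | Penalizes large errors more than MAE, which is useful when accuracy on outliers matters. It is the square root of the `MeanSquaredError` value tracked in the training and evaluation configs. |
| R² Score | Share of the variance in spend that the model explains. As a rough rule of thumb, values above 0.5 indicate meaningful predictive power, though the right bar depends on the use case. |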
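To make the three metrics concrete, the following sketch computes them from their standard definitions on a handful of made-up spend values. It uses plain Python rather than the metric classes named in the configs, and `regression_metrics` is a hypothetical helper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R² from their textbook definitions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variance
    ss_res = sum(e * e for e in errors)                 # unexplained variance
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up 30-day spend targets (negative = spending) and predictions.
y_true = [-100.0, -200.0, -300.0, -400.0]
y_pred = [-110.0, -190.0, -330.0, -370.0]

metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Here the errors are 10, 10, 30, and 30 in magnitude, giving an MAE of 20 and an RMSE of about 22.4; the gap between the two grows as individual errors become more uneven.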
Production Tips
- Normalize spend by time active. Customers who received their card mid-month will naturally have lower spend. Normalize by the number of active days in the window.
- Use predictions for credit limit reviews. Customers predicted to spend significantly more than their limit may benefit from a proactive limit increase.
- Segment by card type. Business cards, premium cards, and basic cards have different spending patterns. Consider training separate models or adding card type as a feature.
- Watch for seasonality. Holiday spending spikes and summer dips are predictable but real. Retrain monthly to capture seasonal patterns.
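The first two tips can be sketched in a few lines of plain Python. Both helpers are hypothetical: how the number of active days and the credit limit are obtained (for example from static attributes) depends on your data and is not covered by this recipe.

```python
def spend_per_active_day(total_spend, active_days):
    """Average daily spend over the days the card was active in the window."""
    if active_days <= 0:
        return None  # card was never active in the window
    return total_spend / active_days


def exceeds_limit(predicted_spend, credit_limit):
    """True when the magnitude of predicted spend is above the credit limit."""
    return abs(predicted_spend) > credit_limit


# A card active for the whole window versus one issued mid-window.
full_month = spend_per_active_day(-600.0, active_days=30)
half_month = spend_per_active_day(-300.0, active_days=15)
print(full_month, half_month)

# Flag a customer for a limit review.
print(exceeds_limit(-5200.0, credit_limit=5000.0))
```

The two customers have very different raw totals but the same daily rate, which is the comparison the normalization is meant to enable.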