Credit Card Spend Prediction

Task type: RegressionTask
Industry: Banking / Financial Services

This recipe predicts how much a customer will spend on their credit card over the next N days. It is useful for credit limit management, revenue forecasting, fraud detection baselines, and identifying high-value customers for retention programs.

What is predicted? A single continuous value — the total outgoing spend (negative amounts in the data) over the target window. Repayments (positive amounts) are excluded.


Prerequisites

Before writing a target function you need:

  • A trained foundation model built on event data that includes a card_transactions data source with an amount column (where negative values represent spending).
  • The monad library installed in your environment (for Python App).

Target Function

Argument    Type        Description
history     Events      All events before the temporal split.
future      Events      All events after the temporal split.
attributes  Attributes  Static entity attributes.
ctx         Dict        Context dictionary containing SPLIT_TIMESTAMP, data mode, etc.

For regression tasks, the function must return one of:

  • np.array([value], dtype=np.float32): the predicted continuous value (total spend).
  • None: exclude this customer (e.g., incomplete observation window).

Full Example

Python
import numpy as np
from datetime import timedelta
from typing import Dict

from monad.ui.target_function import Events, Attributes
from monad.ui.target_function import SPLIT_TIMESTAMP
from monad.ui.target_function import has_incomplete_training_window



# === Configuration ===
TARGET_WINDOW_DAYS = 30
TRANSACTION_DATA_SOURCE = "card_transactions"
AMOUNT_COLUMN = "amount"


def credit_spend_target_fn(
    history: Events,
    future: Events,
    attributes: Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict total credit card spending over the target window."""

    # 1. Ensure the training window is long enough
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if has_incomplete_training_window(ctx, target_window):
        return None

    # 2. Trim future events to the target window
    future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)

    # 3. Filter to outgoing transactions only (negative amounts = spending)
    spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN,
        condition=lambda x: x < 0,
    )

    # 4. Sum total spend
    total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)

    return np.array([total_spend], dtype=np.float32)
Python
def credit_spend_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    """Predict total credit card spending over the target window."""

    # === Configuration ===
    TARGET_WINDOW_DAYS = 30
    TRANSACTION_DATA_SOURCE = "card_transactions"
    AMOUNT_COLUMN = "amount"

    # 1. Ensure the training window is long enough
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None

    # 2. Trim future events to the target window
    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    # 3. Filter to outgoing transactions only (negative amounts = spending)
    spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN,
        condition=lambda x: x < 0,
    )

    # 4. Sum total spend
    total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)

    return np.array([total_spend], dtype=np.float32)

Step-by-Step Breakdown

① Validate the training window

Python
target_window = timedelta(days=TARGET_WINDOW_DAYS)
if has_incomplete_training_window(ctx, target_window):
    return None
Python
target_window = timedelta(days=TARGET_WINDOW_DAYS)
if target_function.has_incomplete_training_window(ctx, target_window):
    return None

Skips samples where the split leaves insufficient future data for a full 30-day observation.
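
To make the idea behind this check concrete, the sketch below reproduces it with plain datetime arithmetic. It is not the library's implementation: window_is_incomplete and data_end are hypothetical names, and the sketch assumes the check amounts to asking whether the split timestamp plus the target window runs past the last timestamp covered by the data.

```python
from datetime import datetime, timedelta, timezone


def window_is_incomplete(split_ts: datetime, window: timedelta, data_end: datetime) -> bool:
    """Hypothetical stand-in: True when the target window extends past the available data."""
    return split_ts + window > data_end


# Assumed end of the available event data.
data_end = datetime(2024, 5, 1, tzinfo=timezone.utc)
window = timedelta(days=30)

early_split = datetime(2024, 3, 15, tzinfo=timezone.utc)  # window ends 2024-04-14
late_split = datetime(2024, 4, 20, tzinfo=timezone.utc)   # window ends 2024-05-20

print(window_is_incomplete(early_split, window, data_end))  # False: full 30 days observed
print(window_is_incomplete(late_split, window, data_end))   # True: sample would be skipped
```

A sample split on 2024-04-20 would only have 11 days of observed future, so its 30-day total would be understated. Returning None keeps such partial targets out of training.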

② Trim future events

Python
future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)
Python
future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

Narrows the future events to the 30-day target window that starts at the split timestamp.
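
The effect of trimming can be illustrated with plain timestamps. This is only a sketch: the event times are made up, and the half-open interval (start inclusive, end exclusive) is an assumption about boundary handling, not documented behavior of interval_from.

```python
from datetime import datetime, timedelta, timezone

split_ts = datetime(2024, 4, 1, tzinfo=timezone.utc)
window = timedelta(days=30)

# Hypothetical event timestamps after the split.
event_times = [
    datetime(2024, 4, 2, tzinfo=timezone.utc),
    datetime(2024, 4, 30, tzinfo=timezone.utc),
    datetime(2024, 5, 1, tzinfo=timezone.utc),   # exactly split + 30 days
    datetime(2024, 5, 10, tzinfo=timezone.utc),
]

# Assumed half-open interval: [split_ts, split_ts + window).
window_end = split_ts + window
in_window = [t for t in event_times if split_ts <= t < window_end]

print(len(in_window))  # 2: only the 2024-04-02 and 2024-04-30 events remain
```

Without this step, every event after the split would contribute to the sum, so customers with a longer observed future would appear to spend more.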

③ Filter to spending transactions

Python
spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
    by=AMOUNT_COLUMN,
    condition=lambda x: x < 0,
)

Credit card data typically uses negative values for purchases and positive values for repayments. This filter keeps only the spending side. Adjust the condition if your data uses a different convention.
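
As a minimal illustration of adjusting the condition, the sketch below applies the same predicate-based filter to plain Python lists under both sign conventions. The amounts and the total_spend helper are hypothetical; in the target function only the lambda passed as condition would change.

```python
from typing import Callable, List


def total_spend(amounts: List[float], is_spend: Callable[[float], bool]) -> float:
    """Sum only the amounts that the predicate marks as spending."""
    return sum(a for a in amounts if is_spend(a))


# Convention used in this recipe: purchases negative, repayments positive.
negative_is_spend = [-40.0, 100.0, -12.5]
# Hypothetical opposite convention: purchases positive, repayments negative.
positive_is_spend = [40.0, -100.0, 12.5]

print(total_spend(negative_is_spend, lambda x: x < 0))  # -52.5
print(total_spend(positive_is_spend, lambda x: x > 0))  # 52.5
```

Whichever convention applies, the sign of the resulting target follows the sign of the stored amounts, so it is worth checking a few raw rows before training.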

④ Sum total spend

Python
total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)
return np.array([total_spend], dtype=np.float32)

The .sum() method aggregates the amount column across all matching transactions. The result is a single float returned as a 1-D float32 array of size 1.

Note: The returned value will be negative (since spending amounts are negative). If you prefer a positive value, use abs(total_spend).


Training

Python
from pathlib import Path
from monad.ui.config import TrainingParams, MetricParams, MetricMonitoringMode
from monad.config.early_stopping import EarlyStopping

from monad.ui.module import load_from_foundation_model, RegressionTask

module = load_from_foundation_model(
    checkpoint_path=Path("./foundation_model"),
    downstream_task=RegressionTask(num_targets=1),
    target_fn=credit_spend_target_fn,
)

training_params = TrainingParams(
    checkpoint_dir=Path("./<this_model>"),
    learning_rate=1e-4,
    epochs=20,
    devices=[0],
    metrics=[
        MetricParams(alias="mae", metric_name="MeanAbsoluteError"),
        MetricParams(alias="mse", metric_name="MeanSquaredError"),
        MetricParams(alias="r2", metric_name="R2Score"),
    ],
    metric_to_monitor="val_mae_0",
    metric_monitoring_mode=MetricMonitoringMode.MIN,
    early_stopping=EarlyStopping(min_delta=1e-4, patience=5),
)

module.fit(training_params, seed=42)

Evaluation

Python
from pathlib import Path
from datetime import datetime, timezone
from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, MetricParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    prediction_date=datetime(2024, 5, 1, tzinfo=timezone.utc),
    output_type=OutputType.DECODED,
    devices=[0],
    metrics=[
        MetricParams(alias="mae", metric_name="MeanAbsoluteError"),
        MetricParams(alias="mse", metric_name="MeanSquaredError"),
        MetricParams(alias="r2", metric_name="R2Score"),
    ],
)

results = module.test(testing_params)

Prediction

Python
from pathlib import Path
from datetime import datetime, timezone
from monad.ui.module import load_from_checkpoint
from monad.ui.config import TestingParams, OutputType

module = load_from_checkpoint(Path("./<this_model>"))

testing_params = TestingParams(
    local_save_location=Path("./predictions.tsv"),
    output_type=OutputType.DECODED,
    prediction_date=datetime(2024, 6, 1, tzinfo=timezone.utc),
    devices=[0],
)

predictions = module.predict(testing_params)

Variations

Category-specific spend

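Predict spend in a specific merchant category (e.g., dining, travel). The example filters on category only, so any positive amounts in that category (such as refunds) are included in the sum; combine it with the amount filter from the main recipe if your data contains them: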

Python
def dining_spend_target_fn(
    history: Events, future: Events, attributes: Attributes, ctx: Dict
) -> np.ndarray | None:
    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if has_incomplete_training_window(ctx, target_window):
        return None
    future = future.interval_from(ctx[SPLIT_TIMESTAMP], target_window)

    dining_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by="merchant_category",
        condition=lambda x: x == "dining",
    )
    total_spend = dining_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)
Python
def dining_spend_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    TARGET_WINDOW_DAYS = 30
    TRANSACTION_DATA_SOURCE = "card_transactions"
    AMOUNT_COLUMN = "amount"

    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None
    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    dining_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by="merchant_category",
        condition=lambda x: x == "dining",
    )
    total_spend = dining_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)

Exclude inactive cardholders

Skip customers with no historical spend — they may be dormant:

Python
if history[TRANSACTION_DATA_SOURCE].filter(
    by=AMOUNT_COLUMN, condition=lambda x: x < 0
).count() == 0:
    return None
Python
def credit_spend_active_only_target_fn(
    history: target_function.Events,
    future: target_function.Events,
    attributes: target_function.Attributes,
    ctx: Dict,
) -> np.ndarray | None:
    # === Configuration ===
    TARGET_WINDOW_DAYS = 30
    TRANSACTION_DATA_SOURCE = "card_transactions"
    AMOUNT_COLUMN = "amount"

    # Exclude customers with no historical spend
    if history[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN, condition=lambda x: x < 0
    ).count() == 0:
        return None

    target_window = timedelta(days=TARGET_WINDOW_DAYS)
    if target_function.has_incomplete_training_window(ctx, target_window):
        return None
    future = future.interval_from(ctx[target_function.SPLIT_TIMESTAMP], target_window)

    spend_transactions = future[TRANSACTION_DATA_SOURCE].filter(
        by=AMOUNT_COLUMN,
        condition=lambda x: x < 0,
    )
    total_spend = spend_transactions.sum(column=AMOUNT_COLUMN)
    return np.array([total_spend], dtype=np.float32)

Metric    Why it matters
MAE       Average prediction error in the same units as spend; easy to interpret.
RMSE      Penalizes large errors more than MAE; useful when outlier accuracy matters. It is the square root of the MSE metric configured above.
R² Score  How much variance the model explains. As a rough guide, values above 0.5 indicate meaningful predictive power.
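
To show how these three metrics relate, here is a small worked example in plain Python. The actual and predicted values are invented for illustration, and the formulas are the standard definitions rather than the library's metric classes.

```python
import math

# Hypothetical 30-day spend (as positive magnitudes) for four customers.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 360.0]

n = len(actual)
errors = [p - a for p, a in zip(predicted, actual)]  # 10, -10, 30, -40

mae = sum(abs(e) for e in errors) / n   # (10 + 10 + 30 + 40) / 4
mse = sum(e * e for e in errors) / n    # (100 + 100 + 900 + 1600) / 4
rmse = math.sqrt(mse)

mean_actual = sum(actual) / n
ss_res = sum(e * e for e in errors)                   # residual sum of squares
ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
r2 = 1 - ss_res / ss_tot

print(mae)             # 22.5
print(round(rmse, 2))  # 25.98
print(round(r2, 3))    # 0.946
```

RMSE exceeds MAE here because the single 40-unit miss dominates the squared errors, which is the behavior the table describes.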

Production Tips

  1. Normalize spend by time active. Customers who received their card mid-month will naturally have lower spend. Normalize by the number of active days in the window.

  2. Use predictions for credit limit reviews. Customers predicted to spend significantly more than their limit may benefit from a proactive limit increase.

  3. Segment by card type. Business cards, premium cards, and basic cards have different spending patterns. Consider training separate models or adding card type as a feature.

  4. Watch for seasonality. Holiday spending spikes and summer dips recur each year. Retrain monthly to capture seasonal patterns.
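
As a sketch of the first tip, the helper below divides total spend by the number of days the card was active inside the target window. The spend_per_active_day function and the card_issued date are hypothetical; the recipe does not define either, and an issue date would have to come from your own attributes or events.

```python
from datetime import date, timedelta
from typing import Optional


def spend_per_active_day(
    total_spend: float,
    window_start: date,
    window_days: int,
    card_issued: date,
) -> Optional[float]:
    """Average daily spend over the days the card was active inside the window."""
    window_end = window_start + timedelta(days=window_days)  # exclusive end
    active_start = max(window_start, card_issued)
    active_days = (window_end - active_start).days
    if active_days <= 0:
        return None  # card was not active during the window
    return total_spend / active_days


# Card issued halfway through a 30-day window starting 2024-04-01: 15 active days.
print(spend_per_active_day(-300.0, date(2024, 4, 1), 30, date(2024, 4, 16)))  # -20.0
# Card active for the whole window: 30 active days.
print(spend_per_active_day(-300.0, date(2024, 4, 1), 30, date(2024, 1, 1)))   # -10.0
```

Both customers spent the same total, but the normalized rate shows the newer cardholder spending twice as fast, which the raw 30-day sum would hide.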