Foundation Model API

The foundation model learns general-purpose entity representations from behavioral event data using self-supervised learning. It is configured via a YAML file and trained using the pretrain() function.

pretrain()

Combines behavioral representation fitting and foundation model training into a single call.

from pathlib import Path

from monad.ui import pretrain

pretrain(
    config_path=Path("fm_config.yaml"),
    output_path=Path("./foundation_model"),
)

Signature

def pretrain(
    config_path: Path,
    output_path: Path,
    use_last_basket_sketches: bool = True,
    recency_sketch_timespan_days: Optional[int] = 60,
    storage_config_path: Optional[Path] = None,
    uniqueness_threshold: float = 0.9,
    nan_threshold: float = 0.9,
    sketch_depth: Optional[int] = None,
    sketch_width: Optional[int] = None,
    callbacks: Optional[list[pytorch_lightning.Callback]] = None,
    pl_logger: Optional[pytorch_lightning.loggers.Logger] = None,
    resume: bool = False,
    overwrite: bool = False,
    seed: Optional[int] = None,
) -> None

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| config_path | Path | required | Path to the YAML pretraining config file. |
| output_path | Path | required | Directory where all output artifacts (checkpoints, features, logs) will be stored. |
| use_last_basket_sketches | bool | True | Whether to add a sketch with event data from the immediate past as a separate input to the model. |
| recency_sketch_timespan_days | int \| None | 60 | Timespan in days for recency sketches. Set to None to disable recency sketches. |
| storage_config_path | Path \| None | None | Path to filesystem configuration (for remote storage setups). |
| uniqueness_threshold | float | 0.9 | Maximum fraction of unique values for a categorical column. Columns above this threshold are excluded. |
| nan_threshold | float | 0.9 | Maximum fraction of NaN values for a column. Columns above this threshold are excluded. |
| sketch_depth | int \| None | None | Sketch depth override (for testing). If not set, determined automatically. |
| sketch_width | int \| None | None | Sketch width override (for testing). If not set, determined automatically. |
| callbacks | list[Callback] \| None | None | PyTorch Lightning callbacks to attach to the trainer. |
| pl_logger | Logger \| None | None | PyTorch Lightning logger instance (e.g., MLFlowLogger, TensorBoardLogger). |
| resume | bool | False | Whether to reuse existing partial results from a previous interrupted run. |
| overwrite | bool | False | Whether to overwrite all existing data in the output directory and start fresh. |
| seed | int \| None | None | Random seed for reproducibility. If not provided, reproducibility is not guaranteed. |
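
To make the two column thresholds more concrete, the following sketch shows how a rule of the form "columns above this threshold are excluded" could be applied to in-memory columns. It is an illustration only and not the library's implementation: the helper names (`nan_fraction`, `unique_fraction`, `columns_to_exclude`) are hypothetical, and it assumes that the unique fraction is computed over non-missing values, which the description above does not specify.

```python
from typing import Dict, Mapping, Sequence, Set


def _is_missing(value: object) -> bool:
    """Treat None and float NaN as missing (NaN is the only value unequal to itself)."""
    return value is None or (isinstance(value, float) and value != value)


def nan_fraction(values: Sequence[object]) -> float:
    """Fraction of entries in a column that are missing."""
    if not values:
        return 0.0
    return sum(1 for v in values if _is_missing(v)) / len(values)


def unique_fraction(values: Sequence[object]) -> float:
    """Fraction of distinct values among the non-missing entries (an assumption)."""
    present = [v for v in values if not _is_missing(v)]
    if not present:
        return 0.0
    return len(set(present)) / len(present)


def columns_to_exclude(
    columns: Mapping[str, Sequence[object]],
    categorical: Set[str],
    uniqueness_threshold: float = 0.9,
    nan_threshold: float = 0.9,
) -> Dict[str, str]:
    """Map each excluded column name to the threshold it exceeded."""
    excluded: Dict[str, str] = {}
    for name, values in columns.items():
        if nan_fraction(values) > nan_threshold:
            excluded[name] = "nan_threshold"
        elif name in categorical and unique_fraction(values) > uniqueness_threshold:
            # The uniqueness check is applied to categorical columns only.
            excluded[name] = "uniqueness_threshold"
    return excluded


if __name__ == "__main__":
    example = {
        "session_id": list("abcdefghij"),   # every value distinct
        "category": ["x", "y"] * 5,         # two distinct values
        "coupon": [None] * 19 + ["c"],      # 95% missing
    }
    print(columns_to_exclude(example, categorical={"session_id", "category", "coupon"}))
```

With the default thresholds of 0.9, a near-unique identifier column and a column that is almost entirely missing would both be dropped under this reading, while a low-cardinality categorical column would be kept.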

Two-Stage Functions

The pretrain() function internally runs two stages. You can also call them individually for more control:

fit_behavioral_representation()

First stage: fits the behavioral representation (feature engineering, sketch computation).

from pathlib import Path

from monad.run import fit_behavioral_representation

fit_behavioral_representation(
    config_path=Path("fm_config.yaml"),
    output_path=Path("./foundation_model"),
)

train_foundation_model()

Second stage: trains the neural foundation model using results from the fit stage.

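from pathlib import Path

from monad.run import train_foundation_model

train_foundation_model(
    output_path=Path("./foundation_model"),  # same directory used by the fit stage
    callbacks=[...],
    pl_logger=mlflow_logger,  # a PyTorch Lightning logger instance created beforehand
)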

Note

train_foundation_model() must be run after fit_behavioral_representation(). Both must use the same output_path.
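
One way to keep the ordering and the shared output_path from drifting apart is a small wrapper that runs both stages in sequence. The sketch below is hypothetical: `run_two_stage` is not part of the library, and the stage functions are passed in as arguments so the snippet does not depend on the package being installed. It assumes only the keyword arguments shown in the examples above.

```python
from pathlib import Path
from typing import Any, Callable


def run_two_stage(
    fit_stage: Callable[..., Any],
    train_stage: Callable[..., Any],
    config_path: Path,
    output_path: Path,
    **train_kwargs: Any,
) -> None:
    """Run the fit stage, then the training stage, against one output directory."""
    # Stage 1: fit the behavioral representation from the YAML config.
    fit_stage(config_path=config_path, output_path=output_path)
    # Stage 2: train on the results of stage 1, reusing the same output_path.
    # Extra keyword arguments (e.g. callbacks, pl_logger) go to this stage only.
    train_stage(output_path=output_path, **train_kwargs)


# Intended usage (requires the library to be installed):
#   from monad.run import fit_behavioral_representation, train_foundation_model
#   run_two_stage(
#       fit_behavioral_representation,
#       train_foundation_model,
#       config_path=Path("fm_config.yaml"),
#       output_path=Path("./foundation_model"),
#   )
```

Passing the stages as callables also makes it easy to insert a manual step between them, such as inspecting the fitted features before committing to a long training run.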

Output Structure

After successful training, the output directory contains:

foundation_model/
├── checkpoints/
│   └── lightning_checkpoints/
│       └── best.ckpt
├── config.yaml              # Resolved config
├── suggested_config.yaml    # Config with column report suggestions applied
├── features/                # Pre-computed behavioral features
└── logs/                    # Training logs
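
As a quick sanity check after a run, the layout above can be verified with a few lines of pathlib. This is a minimal sketch, not a library utility: `missing_artifacts` is a hypothetical helper, and the expected paths are copied from the tree shown here, so they may need adjusting if a given version produces a different layout.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the output tree documented above.
EXPECTED_FILES = (
    "checkpoints/lightning_checkpoints/best.ckpt",
    "config.yaml",
    "suggested_config.yaml",
)
EXPECTED_DIRS = ("features", "logs")


def missing_artifacts(output_path: Path) -> List[str]:
    """Return the expected files and directories that are absent from output_path."""
    missing = [rel for rel in EXPECTED_FILES if not (output_path / rel).is_file()]
    missing += [rel + "/" for rel in EXPECTED_DIRS if not (output_path / rel).is_dir()]
    return missing


if __name__ == "__main__":
    absent = missing_artifacts(Path("./foundation_model"))
    if absent:
        print("Missing artifacts:", ", ".join(absent))
    else:
        print("All expected artifacts are present.")
```

An empty list means every documented artifact was found; it says nothing about whether the checkpoint itself is usable.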

See Also