Training the Model
How to run feature preprocessing and foundation model training with a Python script or from the command line
Check This First!
This article covers BaseModel accessed via the Docker container. If you are using BaseModel as a Snowflake GUI application, please refer to the Snowflake Native App section instead.
Once you have successfully configured your data and model parameters in the YAML file, you are ready to train your foundation model! With BaseModel implemented as a Docker container, you can run the training in one of two ways:
- Run both training stages in a joint pipeline using the pretrain function.
- Run the training in separate stages, which may be helpful, for example, in a dual-environment setup. In this case, you will use two functions: fit_behavioral_representation and train_foundation_model.
Both options can be executed either in Python or via the command line.
Option 1: Run the training as a joint pipeline
This is the simplest way to train your foundation model in a single step. The pretrain function first validates your configuration, then automatically runs both training stages: it fits the behavioral representation, and finally trains the foundation model using the output from the fitting step.
Using Python script
Here’s how a script executing the pretrain function looks in its simplest Python form:
```python
from monad.ui import pretrain
from pathlib import Path

pretrain(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
)
```
Parameters

- config_path : str. Required, no default. Path to the YAML configuration file.
- output_path : str. Required, no default. Path to store training results.
- storage_config_path : str. Optional, default: None. File system configuration.
- resume : bool. Optional, default: False. If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
- overwrite : bool. Optional, default: False. If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error is raised.
- callbacks : list[Callback]. Optional, default: Lightning factory default. List of additional PyTorch Lightning callbacks to add to training.
- pl_logger : instance of pytorch_lightning.loggers.Logger. Optional, default: None. A logger compatible with PyTorch Lightning, used to record metrics and training progress.
- use_last_basket_sketches : bool. Optional, default: True. Whether to include a sketch of the most recent events as an additional input.
- recency_sketch_timespan_days : int | None. Optional, default: system constant. If set, defines the window in days for recency-based sketches. Recency sketches store information about how far in the past the interactions took place.
- nan_threshold : float. Required, default: 0.9. Maximum fraction of missing values allowed in a column for it to be processed.
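To make the nan_threshold semantics concrete, here is a minimal stdlib-only sketch of the documented check. The helper name is hypothetical and not part of the monad API; BaseModel performs this validation internally.

```python
import math

def column_passes_nan_threshold(values, nan_threshold=0.9):
    """Return True when the fraction of missing values in a column does
    not exceed nan_threshold. Hypothetical helper mirroring the
    documented behavior; not part of the monad API."""
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values) <= nan_threshold

# 2 of 4 values are missing: a 0.5 missing fraction passes the 0.9 default...
assert column_passes_nan_threshold([1.0, None, 3.0, float("nan")])
# ...but would fail a stricter threshold of 0.25.
assert not column_passes_nan_threshold([1.0, None, 3.0, float("nan")], 0.25)
```

In other words, with the default of 0.9, only columns where more than 90% of values are missing are rejected.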
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
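The interaction between resume, overwrite, and existing checkpoints can be summarized in a small decision helper. This is an illustrative sketch of the documented flag semantics only; the function name is hypothetical and the real checks happen inside BaseModel.

```python
def resolve_checkpoint_action(resume, overwrite, checkpoints_exist):
    """Hypothetical helper mirroring the documented semantics of the
    resume and overwrite flags; not part of the monad API."""
    if resume and overwrite:
        # The two flags are mutually exclusive.
        raise ValueError("resume and overwrite cannot both be True")
    if resume:
        # resume requires a checkpoint from a previous run.
        if not checkpoints_exist:
            raise RuntimeError("resume requested but no checkpoint exists")
        return "resume from last checkpoint"
    if checkpoints_exist and not overwrite:
        # Neither flag set, but previous results are present.
        raise RuntimeError("previous training results present; set resume or overwrite")
    return "start fresh"

assert resolve_checkpoint_action(False, False, False) == "start fresh"
assert resolve_checkpoint_action(True, False, True) == "resume from last checkpoint"
assert resolve_checkpoint_action(False, True, True) == "start fresh"
```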
Note
Parameters provided here override those defined in the YAML config.
Executing in command line
The example below demonstrates how to run the pretrain function from the command line. You can use the same parameters as above by adding them after --, with arguments enclosed in double quotes. The only exceptions are pl_logger and callbacks, which must be configured in Python.
```shell
python -m monad.run \
    --pretrain \
    --config-path "path/to/config.yml" \
    --features-path "path/to/store/pretrain/artifacts" \
    --overwrite
```
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Option 2: Run the training in modular stages
The modular training pipeline splits the process into two separate stages: one for fitting the behavioral representation, and another for training the foundation model. This is ideal when running the two stages in different environments (e.g., one with more CPU and RAM for fitting, and one with GPU enabled for training).
Stage 1: Fit behavioral representation
This stage analyzes your data and builds the feature representation needed for model training.
Using Python script
Here’s how a script executing the fit_behavioral_representation function looks in its simplest Python form:
```python
from monad.ui import fit_behavioral_representation
from pathlib import Path

fit_behavioral_representation(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
)
```
Parameters

- config_path : str. Required, no default. Path to the YAML configuration file.
- output_path : str. Required, no default. Path to store training results.
- storage_config_path : str. Optional, default: None. File system configuration.
- resume : bool. Optional, default: False. If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
- overwrite : bool. Optional, default: False. If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error is raised.
- nan_threshold : float. Required, default: 0.9. Maximum fraction of missing values allowed in a column for it to be processed.
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Executing in command line
The example below demonstrates how to run the fit_behavioral_representation function from the command line. You can use the same parameters as above by adding them after --, with arguments enclosed in double quotes.
```shell
python -m monad.run \
    --fit \
    --config-path "path/to/config.yml" \
    --features-path "path/to/store/pretrain/artifacts" \
    --overwrite
```
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Stage 2: Train foundation model
This stage trains the foundation model using intermediate outputs from the fitting step.
Using Python script
Here’s how a script executing the train_foundation_model function looks in its simplest Python form:
```python
from monad.ui import train_foundation_model
from pathlib import Path

train_foundation_model(
    output_path=Path("path/to/store/pretrain/artifacts"),
)
```
Parameters

- output_path : str. Required, no default. Path to store training results.
- storage_config_path : str. Optional, default: None. File system configuration.
- resume : bool. Optional, default: False. If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
- overwrite : bool. Optional, default: False. If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error is raised.
- callbacks : list[Callback]. Optional, default: Lightning factory default. List of additional PyTorch Lightning callbacks to add to training.
- pl_logger : instance of pytorch_lightning.loggers.Logger. Optional, default: None. A logger compatible with PyTorch Lightning, used to record metrics and training progress.
- use_last_basket_sketches : bool. Optional, default: True. Whether to include a sketch of the most recent events as an additional input.
- recency_sketch_timespan_days : int | None. Optional, default: system constant. If set, defines the window in days for recency-based sketches.
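To illustrate what the recency window controlled by recency_sketch_timespan_days means, here is a minimal sketch that keeps only events inside such a window. The helper is purely illustrative; BaseModel builds recency sketches from this window internally.

```python
from datetime import datetime, timedelta

def events_within_recency_window(event_times, now, timespan_days):
    """Keep only events that fall inside the recency window.
    Illustrative only; not part of the monad API."""
    cutoff = now - timedelta(days=timespan_days)
    return [t for t in event_times if t >= cutoff]

now = datetime(2024, 6, 30)
events = [datetime(2024, 6, 29), datetime(2024, 5, 1), datetime(2024, 6, 10)]
# With a 30-day window, only the two June events are kept.
recent = events_within_recency_window(events, now, timespan_days=30)
assert recent == [datetime(2024, 6, 29), datetime(2024, 6, 10)]
```

Interactions older than the window fall outside the recency sketch; the sketch itself records how far in the past the retained interactions took place.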
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Executing in command line
The example below demonstrates how to run the train_foundation_model function from the command line.

- Please note that config-path should no longer be provided; BaseModel will use the configuration file stored at the fitting stage.
- You can use the same parameters as above by adding them after --, with arguments enclosed in double quotes. The only exceptions are pl_logger and callbacks, which must be configured in Python.
```shell
python -m monad.run \
    --fm \
    --features-path "path/to/store/pretrain/artifacts" \
    --overwrite
```
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
End of Foundation Model training
Training is complete when:

- Console output confirms that model checkpoints have been saved.
- A _FINISHED folder appears in the location specified by your output_path, containing the best model.