Training the Model

How to run the feature preprocessing and foundation model training with Python script or command line

⚠️

Check This First!

This article refers to BaseModel accessed via a Docker container. If you are using BaseModel as a Snowflake GUI application, please refer to the Snowflake Native App section instead.


Once you have successfully configured your data and model parameters in the YAML file, you are ready to train your foundation model! With BaseModel implemented as a Docker container, you can run the training in one of two ways:

  • Run both training stages in a joint pipeline using the pretrain function.
  • Run training in separate stages, which may be helpful, for example, in a dual-environment setup. In this case, you will use two functions: fit_behavioral_representation and train_foundation_model.

Both options can be executed either in Python or via the command line.

Option 1: Run the training as a joint pipeline

This is the simplest way to train your foundation model in a single step. The pretrain function first validates your configuration, then automatically runs both training stages: it fits the behavioral representation, and finally trains the foundation model using the output from the fitting step.

Using Python script

Here’s how a script executing the pretrain function looks in its simplest Python form:

from monad.ui import pretrain
from pathlib import Path

pretrain(
    config_path=Path("path/to/config.yaml"), 
    output_path=Path("path/to/store/pretrain/artifacts")
)
Parameters
  • config_path : str
    Required. No default.
    Path to YAML configuration file.
  • output_path : str
    Required. No default.
    Path to store training results.
  • storage_config_path : str
    Optional. Default: None
    File system configuration.
  • resume : bool
    Optional. Default: False
    If True, training will resume from the last checkpoint if one exists; otherwise, an error will be raised.
  • overwrite : bool
    Optional. Default: False
    If True, any previous training results will be overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error will be raised.
  • callbacks : list[Callback]
    Optional. Default: Lightning factory default
    List of additional PyTorch Lightning callbacks to add to training.
  • pl_logger : pytorch_lightning.loggers.Logger
    Optional. Default: None
    A logger compatible with PyTorch Lightning, used to record metrics and training progress.
  • use_last_basket_sketches : bool
    Optional. Default: True.
    Whether to include a sketch of the most recent events as an additional input.
  • recency_sketch_timespan_days : int | None
    Optional. Default: system constant.
    If set, defines the window in days for recency-based sketches. Recency sketches store information about how far in the past the interactions took place.
  • nan_threshold : float
    Optional. Default: 0.9
    Maximum fraction of missing values allowed for a column to be processed.
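
For example, a run that overwrites previous artifacts and attaches a custom callback and logger might look like the sketch below. The EarlyStopping callback, the "val_loss" metric name, and the logger directory are illustrative assumptions, not BaseModel requirements:

from pathlib import Path

from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning.loggers import CSVLogger

from monad.ui import pretrain

pretrain(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
    overwrite=True,                                 # discard artifacts from a previous run
    nan_threshold=0.8,                              # drop columns with more than 80% missing values
    callbacks=[EarlyStopping(monitor="val_loss")],  # extra PyTorch Lightning callbacks
    pl_logger=CSVLogger(save_dir="path/to/logs"),   # record metrics as CSV files
)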

❗️

Note

The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.

⚠️

Note

Parameters provided here override those defined in the YAML config.

Executing in command line

The example below demonstrates how to run the pretrain function from the command line. You can pass the same parameters as above as flags prefixed with --, with their values enclosed in double quotes. The only exceptions are pl_logger and callbacks, which must be configured in Python.

python -m monad.run \
--pretrain \
--config-path "path/to/config.yml" \
--features-path "path/to/store/pretrain/artifacts" \
--overwrite

❗️

Note

The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.

⚠️

Note

Parameters provided here override those defined in the YAML config.

Option 2: Run the training in modular stages

The modular training pipeline splits the process into two separate stages: one for fitting the behavioral representation, and another for training the foundation model. This is ideal when running the two stages in different environments (e.g., one with more CPU and RAM for fitting, and one with GPU enabled for training).
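
As a sketch, the two stages can be chained from Python as shown below, with the second call picking up the artifacts written by the first. The only assumption is that the shared artifacts directory is reachable from both environments:

from pathlib import Path

from monad.ui import fit_behavioral_representation, train_foundation_model

artifacts = Path("path/to/store/pretrain/artifacts")

# Stage 1: fit the behavioral representation (CPU- and RAM-heavy environment)
fit_behavioral_representation(
    config_path=Path("path/to/config.yaml"),
    output_path=artifacts,
)

# Stage 2: train the foundation model on the stored artifacts (GPU environment)
train_foundation_model(output_path=artifacts)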

Stage 1: Fit behavioral representation

This stage analyzes your data and builds the feature representation needed for model training.

Using Python script

Here’s how a script executing the fit_behavioral_representation function looks in its simplest Python form:

from monad.ui import fit_behavioral_representation
from pathlib import Path

fit_behavioral_representation(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts")
)
Parameters

  • config_path : str
    Required. No default.
    Path to YAML configuration file.
  • output_path : str
    Required. No default.
    Path to store training results.
  • storage_config_path : str
    Optional. Default: None
    File system configuration.
  • resume : bool
    Optional. Default: False
    If True, training will resume from the last checkpoint if one exists; otherwise, an error will be raised.
  • overwrite : bool
    Optional. Default: False
    If True, any previous training results will be overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error will be raised.
  • nan_threshold : float
    Optional. Default: 0.9
    Maximum fraction of missing values allowed for a column to be processed.
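
For instance, resuming an interrupted fitting run with an explicit file system configuration might look like this; the storage.yaml file name is an illustrative placeholder:

from pathlib import Path

from monad.ui import fit_behavioral_representation

fit_behavioral_representation(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
    storage_config_path=Path("path/to/storage.yaml"),  # illustrative file system configuration
    resume=True,                                       # continue from the last checkpoint
)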

❗️

Note

The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.

⚠️

Note

Parameters provided here override those defined in the YAML config.

Executing in command line

The example below demonstrates how to call the fit_behavioral_representation function from the command line. You can pass the same parameters as above as flags prefixed with --, with their values enclosed in double quotes.

python -m monad.run \
--fit \
--config-path "path/to/config.yml" \
--features-path "path/to/store/pretrain/artifacts" \
--overwrite

❗️

Note

The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.

⚠️

Note

Parameters provided here override those defined in the YAML config.

Stage 2: Train foundation model

This stage trains the foundation model using intermediate outputs from the fitting step.

Using Python script

Here’s how a script executing the train_foundation_model function looks in its simplest Python form:

from monad.ui import train_foundation_model
from pathlib import Path

train_foundation_model(
    output_path=Path("path/to/store/pretrain/artifacts")
)
Parameters
  • output_path : str
    Required. No default.
    Path to store training results.
  • storage_config_path : str
    Optional. Default: None
    File system configuration.
  • resume : bool
    Optional. Default: False
    If True, training will resume from the last checkpoint if one exists; otherwise, an error will be raised.
  • overwrite : bool
    Optional. Default: False
    If True, any previous training results will be overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error will be raised.
  • callbacks : list[Callback]
    Optional. Default: Lightning factory default
    List of additional PyTorch Lightning callbacks to add to training.
  • pl_logger : pytorch_lightning.loggers.Logger
    Optional. Default: None
    A logger compatible with PyTorch Lightning, used to record metrics and training progress.
  • use_last_basket_sketches : bool
    Optional. Default: True
    Whether to include a sketch of the most recent events as an additional input.
  • recency_sketch_timespan_days : int | None
    Optional. Default: system constant
    If set, defines the window in days for recency-based sketches.
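
For example, resuming an interrupted run with an explicit recency window might look like this; the 90-day value is illustrative, and the default is a system constant:

from pathlib import Path

from monad.ui import train_foundation_model

train_foundation_model(
    output_path=Path("path/to/store/pretrain/artifacts"),
    resume=True,                      # pick up from the last checkpoint if one exists
    recency_sketch_timespan_days=90,  # illustrative 90-day window for recency sketches
)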

❗️

Note

The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.

⚠️

Note

Parameters provided here override those defined in the YAML config.

Executing in command line

The example below demonstrates how to run the train_foundation_model function from the command line.

  • Please note that config-path should no longer be provided; BaseModel will use the configuration file stored at the fitting stage.
  • You can pass the same parameters as above as flags prefixed with --, with their values enclosed in double quotes. The only exceptions are pl_logger and callbacks, which must be configured in Python.
python -m monad.run \
--fm \
--features-path "path/to/store/pretrain/artifacts" \
--overwrite

❗️

Note

The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.

⚠️

Note

Parameters provided here override those defined in the YAML config.

End of Foundation Model training

Training is complete when:

  • Console output confirms that model checkpoints have been saved.
  • A _FINISHED folder appears in the location specified by your output_path, containing the best model.
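
If you want to verify completion programmatically, a minimal sketch, assuming the _FINISHED folder is created directly under output_path, could be:

from pathlib import Path

output_path = Path("path/to/store/pretrain/artifacts")

# The _FINISHED folder is created once training has completed successfully
if (output_path / "_FINISHED").exists():
    print("Foundation model training is complete.")
else:
    print("Training has not finished yet.")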