Training the Model
How to run feature preprocessing and foundation model training with a Python script or from the command line
Check This First!
This article covers BaseModel accessed via the Docker container. If you are using BaseModel as a Snowflake GUI application, please refer to the Snowflake Native App section instead.
Once you have successfully configured your data and model parameters in the YAML file, you are ready to train your foundation model! With BaseModel implemented as a Docker container, you can run the training in one of two ways:
- Run both training stages in a joint pipeline using the pretrain function.
- Run the training in separate stages, which may be helpful, for example, in a dual-environment setup. In this case, you will use two functions: fit_behavioral_representation and train_foundation_model.
Both options can be executed either in Python or via the command line.
Option 1: Run the training as a joint pipeline
This is the simplest way to train your foundation model in a single step. The pretrain function first validates your configuration, then automatically runs both training stages: it fits the behavioral representation, and finally trains the foundation model using the output from the fitting step.
Using Python script
Here’s how a script executing the pretrain function looks in its simplest Python form:
```python
from monad.ui import pretrain
from pathlib import Path

pretrain(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
)
```
Parameters

- config_path : str. Required, no default. Path to the YAML configuration file.
- output_path : str. Required, no default. Path to store training results.
- storage_config_path : str. Optional, default: None. File system configuration.
- resume : bool. Optional, default: False. If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
- overwrite : bool. Optional, default: False. If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error is raised.
- callbacks : list[Callback]. Optional, default: Lightning factory default. List of additional PyTorch Lightning callbacks to add to training.
- pl_logger : instance of pytorch_lightning.loggers.Logger. Optional, default: None. A logger compatible with PyTorch Lightning, used to record metrics and training progress.
- use_last_basket_sketches : bool. Optional, default: True. Whether to include a sketch of the most recent events as an additional input.
- recency_sketch_timespan_days : int | None. Optional, default: system constant. If set, defines the window in days for recency-based sketches. Recency sketches store information about how far in the past the interactions took place.
- nan_threshold : float. Required, default: 0.9. Maximum fraction of missing values allowed in a column for it to be processed.
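To make the nan_threshold semantics concrete, here is a minimal stdlib-only sketch of the documented check. The helper name is hypothetical and not part of the monad API; BaseModel performs this validation internally.

```python
import math

def column_passes_nan_threshold(values, nan_threshold=0.9):
    """Return True when the fraction of missing values in a column does
    not exceed nan_threshold. Hypothetical helper mirroring the
    documented behavior; not part of the monad API."""
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values) <= nan_threshold

# 2 of 4 values are missing: a 0.5 missing fraction passes the 0.9 default...
assert column_passes_nan_threshold([1.0, None, 3.0, float("nan")])
# ...but would fail a stricter threshold of 0.25.
assert not column_passes_nan_threshold([1.0, None, 3.0, float("nan")], 0.25)
```

In other words, with the default of 0.9, only columns where more than 90% of values are missing are rejected.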
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
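The interaction between resume, overwrite, and existing checkpoints can be summarized in a small decision helper. This is an illustrative sketch of the documented flag semantics only; the function name is hypothetical and the real checks happen inside BaseModel.

```python
def resolve_checkpoint_action(resume, overwrite, checkpoints_exist):
    """Hypothetical helper mirroring the documented semantics of the
    resume and overwrite flags; not part of the monad API."""
    if resume and overwrite:
        # The two flags are mutually exclusive.
        raise ValueError("resume and overwrite cannot both be True")
    if resume:
        # resume requires a checkpoint from a previous run.
        if not checkpoints_exist:
            raise RuntimeError("resume requested but no checkpoint exists")
        return "resume from last checkpoint"
    if checkpoints_exist and not overwrite:
        # Neither flag set, but previous results are present.
        raise RuntimeError("previous training results present; set resume or overwrite")
    return "start fresh"

assert resolve_checkpoint_action(False, False, False) == "start fresh"
assert resolve_checkpoint_action(True, False, True) == "resume from last checkpoint"
assert resolve_checkpoint_action(False, True, True) == "start fresh"
```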
Note
Parameters provided here override those defined in the YAML config.
Executing in command line
The example below demonstrates how to run the pretrain function from the command line. You can use the same parameters as above by adding them after --, with arguments enclosed in double quotes. The only exceptions are pl_logger and callbacks, which must be configured in Python.
```shell
python -m monad.run \
    --pretrain \
    --config-path "path/to/config.yml" \
    --features-path "path/to/store/pretrain/artifacts" \
    --overwrite
```
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Option 2: Run the training in modular stages
The modular training pipeline splits the process into two separate stages: one for fitting the behavioral representation, and another for training the foundation model. This is ideal when running the two stages in different environments (e.g., one with more CPU and RAM for fitting, and one with GPU enabled for training).
Stage 1: Fit behavioral representation
This stage analyzes your data and builds the feature representation needed for model training.
Using Python script
Here’s how a script executing the fit_behavioral_representation function looks in its simplest Python form:
```python
from monad.ui import fit_behavioral_representation
from pathlib import Path

fit_behavioral_representation(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
)
```
Parameters

- config_path : str. Required, no default. Path to the YAML configuration file.
- output_path : str. Required, no default. Path to store training results.
- storage_config_path : str. Optional, default: None. File system configuration.
- resume : bool. Optional, default: False. If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
- overwrite : bool. Optional, default: False. If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error is raised.
- nan_threshold : float. Required, default: 0.9. Maximum fraction of missing values allowed in a column for it to be processed.
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Executing in command line
The example below demonstrates how to run the fit_behavioral_representation function from the command line. You can use the same parameters as above by adding them after --, with arguments enclosed in double quotes.
```shell
python -m monad.run \
    --fit \
    --config-path "path/to/config.yml" \
    --features-path "path/to/store/pretrain/artifacts" \
    --overwrite
```
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Stage 2: Train foundation model
This stage trains the foundation model using intermediate outputs from the fitting step.
Using Python script
Here’s how a script executing the train_foundation_model function looks in its simplest Python form:
```python
from monad.ui import train_foundation_model
from pathlib import Path

train_foundation_model(
    output_path=Path("path/to/store/pretrain/artifacts"),
)
```
Parameters

- output_path : str. Required, no default. Path to store training results.
- storage_config_path : str. Optional, default: None. File system configuration.
- resume : bool. Optional, default: False. If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
- overwrite : bool. Optional, default: False. If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous run are present, an error is raised.
- callbacks : list[Callback]. Optional, default: Lightning factory default. List of additional PyTorch Lightning callbacks to add to training.
- pl_logger : instance of pytorch_lightning.loggers.Logger. Optional, default: None. A logger compatible with PyTorch Lightning, used to record metrics and training progress.
- use_last_basket_sketches : bool. Optional, default: True. Whether to include a sketch of the most recent events as an additional input.
- recency_sketch_timespan_days : int | None. Optional, default: system constant. If set, defines the window in days for recency-based sketches.
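To illustrate what the recency window controlled by recency_sketch_timespan_days means, here is a minimal sketch that keeps only events inside such a window. The helper is purely illustrative; BaseModel builds recency sketches from this window internally.

```python
from datetime import datetime, timedelta

def events_within_recency_window(event_times, now, timespan_days):
    """Keep only events that fall inside the recency window.
    Illustrative only; not part of the monad API."""
    cutoff = now - timedelta(days=timespan_days)
    return [t for t in event_times if t >= cutoff]

now = datetime(2024, 6, 30)
events = [datetime(2024, 6, 29), datetime(2024, 5, 1), datetime(2024, 6, 10)]
# With a 30-day window, only the two June events are kept.
recent = events_within_recency_window(events, now, timespan_days=30)
assert recent == [datetime(2024, 6, 29), datetime(2024, 6, 10)]
```

Interactions older than the window fall outside the recency sketch; the sketch itself records how far in the past the retained interactions took place.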
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
Executing in command line
The example below demonstrates how to run the train_foundation_model function from the command line.

- Please note that config-path should no longer be provided; BaseModel will use the configuration file stored at the fitting stage.
- You can use the same parameters as above by adding them after --, with arguments enclosed in double quotes. The only exceptions are pl_logger and callbacks, which must be configured in Python.
```shell
python -m monad.run \
    --fm \
    --features-path "path/to/store/pretrain/artifacts" \
    --overwrite
```
Note
The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
Note
Parameters provided here override those defined in the YAML config.
End of Foundation Model training
Training is complete when:

- Console output confirms that model checkpoints have been saved.
- A _FINISHED folder appears in the location specified by your output_path, containing the best model.