monad.ui.pretrain
monad.ui.pretrain(config_path, output_path, use_last_basket_sketches=True, recency_sketch_timespan_days=RECENCY_SKETCH_TIMESPAN_DAYS_DEFAULT, storage_config_path=None, nan_threshold=0.9, callbacks=None, pl_logger=None, resume=False, overwrite=False, seed=None)
Validates the configuration, then runs both training stages automatically: it first fits the behavioral representation, then trains the foundation model on the output of the fitting step.
from monad.ui import pretrain
from pathlib import Path

pretrain(
    config_path=Path("path/to/config.yaml"),
    output_path=Path("path/to/store/pretrain/artifacts"),
)

| Parameters |
|---|
config_path : pathlib.Path
Path to YAML configuration file.
output_path : pathlib.Path
Path to store training results.
storage_config_path : Optional[pathlib.Path]
Default: None
Path to the file system configuration.
resume : bool
Default: False
If True, training resumes from the last checkpoint if one exists; otherwise an error is raised.
overwrite : bool
Default: False
If True, any previous training results are overwritten. Otherwise, if resume is not set and checkpoints from a previous training run are present, an error is raised.
Note: The parameters resume and overwrite cannot both be set to True. Doing so will raise an error.
callbacks : Optional[list[pytorch_lightning.callbacks.Callback]]
Default: None
List of additional PyTorch Lightning callbacks to add to training.
pl_logger : Optional[pytorch_lightning.loggers.Logger]
Default: None
A logger compatible with PyTorch Lightning, used to record metrics and training progress.
use_last_basket_sketches : bool
Default: True
Whether to include a sketch of the most recent events as an additional input.
recency_sketch_timespan_days : Optional[int]
Default: RECENCY_SKETCH_TIMESPAN_DAYS_DEFAULT
If set, defines the window, in days, for recency-based sketches. Recency sketches store information about how far in the past the interactions took place.
nan_threshold : float
Default: 0.9
Maximum fraction of missing values a column may contain and still be processed.
seed : Optional[int]
Default: None
Random seed for training. When provided, it makes the results reproducible.
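The interaction between resume and overwrite described above can be summarized as a small decision function. This is only an illustrative sketch: `resolve_run_mode`, its `checkpoint_exists` argument, the returned labels, and the exception types are all hypothetical and are not part of monad; the library's actual checks and error types may differ.

```python
def resolve_run_mode(resume: bool, overwrite: bool, checkpoint_exists: bool) -> str:
    """Hypothetical helper mirroring the documented resume/overwrite rules.

    Returns "resume", "overwrite", or "fresh". Exception types are assumptions.
    """
    # Documented rule: the two flags are mutually exclusive.
    if resume and overwrite:
        raise ValueError("resume and overwrite cannot both be True")

    # resume=True requires an existing checkpoint.
    if resume:
        if not checkpoint_exists:
            raise FileNotFoundError("resume=True but no checkpoint was found")
        return "resume"

    # overwrite=True discards any previous results.
    if overwrite:
        return "overwrite"

    # Neither flag set: leftover checkpoints are treated as an error.
    if checkpoint_exists:
        raise FileExistsError("previous checkpoints found; set resume or overwrite")
    return "fresh"


if __name__ == "__main__":
    print(resolve_run_mode(resume=True, overwrite=False, checkpoint_exists=True))
    print(resolve_run_mode(resume=False, overwrite=False, checkpoint_exists=False))
```

Read this way, the default call (both flags False) is only safe on an empty output_path; a rerun into the same directory needs exactly one of the two flags.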
| Returns |
|---|
Saves results under output_path.
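To make the nan_threshold parameter more concrete, the sketch below computes the fraction of missing values per column and keeps only the columns at or below the threshold. It is a plain-Python illustration, not monad's implementation: the helper names, the dict-of-lists table layout, the treatment of None and float NaN as missing, and the inclusive comparison are all assumptions.

```python
import math


def missing_fraction(values: list) -> float:
    """Fraction of entries that are None or float NaN (assumed notion of 'missing')."""
    if not values:
        return 0.0
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values)


def columns_within_nan_threshold(table: dict, nan_threshold: float = 0.9) -> list:
    """Names of columns whose missing fraction does not exceed nan_threshold.

    Whether the real check is inclusive (<=) or strict (<) is an assumption here.
    """
    return [
        name for name, values in table.items()
        if missing_fraction(values) <= nan_threshold
    ]


if __name__ == "__main__":
    # Hypothetical data: "price" is 25% missing, "coupon" is 95% missing.
    table = {
        "price": [1.0, None, 3.0, 4.0],
        "coupon": [None] * 19 + ["A"],
    }
    print(columns_within_nan_threshold(table))  # ['price']
```

With the default of 0.9, only columns that are almost entirely empty fall outside the limit, so lowering the value is the way to be stricter about sparse columns.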
