Fine-tuning training parameters

⚠️ Check This First!

This article covers BaseModel accessed via a Docker container. If you are using BaseModel as a Snowflake GUI application, please refer to the Snowflake Native App section.


Training parameters defined at the foundation model training stage are loaded and used by default when training the scenario model. As a result, the scenario model target location (the checkpoint_dir parameter) is the only mandatory parameter in the scenario model training script.

However, all constructor parameters for the PyTorch Lightning Trainer stored in MonadTrainingParams can be overwritten by providing them as input to the training_params declaration, as in the sketch below.
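
For illustration, here is a minimal sketch of such an override. It assumes the scenario training script consumes a plain training_params declaration whose fields mirror the parameters listed below; the exact call site and container type depend on your setup.

```python
# Hedged sketch: the surrounding API is assumed; parameter names
# follow the reference list below.
training_params = dict(
    checkpoint_dir="/checkpoints/scenario_model",  # the only mandatory parameter
    epochs=3,                # overrides the inherited default of 1
    learning_rate=0.0005,    # overrides the inherited default of 0.001
    devices=[0, 1],          # train on GPU devices 0 and 1
    accelerator="gpu",
)
```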


Parameters
  • epochs: int
    default: 1
    Number of epochs to train the model for.

  • learning_rate: float
    default: 0.001
    The learning rate.

  • devices : Union[List[int], str, int, None]
    default: [0]
    The devices to use. Can be set to a positive number (int or str), a sequence of device indices (list or str), the value -1 to indicate that all available devices should be used, or "auto" for automatic selection based on the chosen accelerator.

  • accelerator : Literal["cpu", "gpu"]
    default: "gpu"
    The accelerator to use: GPU or CPU.

  • precision : Literal[64, 32, 16, "64", "32", "16", "bf16", "16-true", "16-mixed", "bf16-true", "bf16-mixed", "32-true", "64-true"]
    default: DEFAULT_PRECISION
    Controls the float precision used for training: double precision (64, "64" or "64-true"), full precision (32, "32" or "32-true"), 16-bit mixed precision (16, "16" or "16-mixed"), or bfloat16 mixed precision ("bf16" or "bf16-mixed"). The DEFAULT_PRECISION constant sets precision to "bf16-mixed" if CUDA is available, else "16-mixed".

  • limit_train_batches : Union[int, float]
    default: None
    Limits the number of training batches per epoch (float = fraction, int = number of batches). Useful, e.g., for speeding up testing.

  • limit_val_batches : Union[int, float]
    default: None
    Limits the number of validation batches per epoch (float = fraction, int = number of batches). Useful, e.g., for speeding up testing.

  • loss : Callable | None
    default: None
    The loss function to use. If not provided, the default loss function for the task will be used (see the loss and metrics sketch after this list).

  • metrics : Dict[str, Metric]
    default: None
    Metrics to use in validation. If not provided, the default validation metrics for the task will be used.

  • checkpoint_dir : Union[str, Path]
    default: None
    The path to the location where model checkpoints should be stored.

  • metric_to_monitor : str
    default: None
    Determines which metric should be used to select the best model for saving (see the checkpointing sketch after this list).

  • metric_monitoring_mode : str, one of "min", "max"
    default: None
    Indicates whether a smaller or greater value of the selected metric is better.

  • callbacks : List[Callback]
    default: Lightning factory default
    List of additional PyTorch Lightning callbacks to add to training.

  • resume : bool
    default: False
    Whether to resume training. If True, training will be resumed from the last checkpoint if one exists; otherwise an error will be raised.

  • overwrite : bool
    default: False
    Whether to overwrite previous training results. If True, results will be overwritten. Otherwise, if resume is not set and checkpoints from a previous training run are present, an error will be raised.

  • gradient_clip_val : Union[int, float]
    default: None
    Gradient clipping value (above which gradients are clipped), passed to the PyTorch Lightning Trainer.

  • warm_start_steps : int
    default: 0
    Number of warm-start training steps used while fine-tuning a supervised model. Ignored if no pretrained model is used.

  • top_k : int
    default: 12
    Only valid for a recommendation task. Number of targets to recommend. The top k targets will be included in validation metrics; this does not affect model training (see the recommendation sketch after this list).

  • targets_to_include : List[str]
    default: None
    Only valid for a recommendation task. Names of the targets that should be used for validation metrics; this does not affect model training.

  • checkpoint_every_n_steps : int
    default: None
    If set, intra-epoch checkpointing is performed: a checkpoint is saved every n training steps.
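
Under the same assumptions as the first sketch, the checkpoint-related parameters combine as follows: a monitored metric selects the best model, intra-epoch checkpointing is enabled, and an interrupted run is resumed. The metric name is a hypothetical example.

```python
# Hedged sketch; "val_loss" is a hypothetical metric name.
training_params = dict(
    checkpoint_dir="/checkpoints/scenario_model",
    metric_to_monitor="val_loss",  # select the best checkpoint by this metric
    metric_monitoring_mode="min",  # a smaller val_loss is better
    checkpoint_every_n_steps=500,  # additionally checkpoint every 500 steps
    resume=True,                   # continue from the last checkpoint if one exists
    # overwrite=True would instead discard previous results
)
```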
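
Because loss accepts any callable and metrics a dictionary of Metric objects, a custom objective can be sketched like this. A binary classification task, and torch and torchmetrics being available, are assumptions here.

```python
import torch.nn.functional as F
from torchmetrics.classification import BinaryAUROC, BinaryF1Score

# Hedged sketch: a binary classification scenario is assumed.
training_params = dict(
    checkpoint_dir="/checkpoints/scenario_model",
    loss=F.binary_cross_entropy_with_logits,  # replaces the task's default loss
    metrics={                                 # replaces the default validation metrics
        "auroc": BinaryAUROC(),
        "f1": BinaryF1Score(),
    },
)
```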
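
Finally, for a recommendation task, the validation-only parameters can be sketched as follows; the target names are hypothetical.

```python
# Hedged sketch; the target names are hypothetical examples.
training_params = dict(
    checkpoint_dir="/checkpoints/recommender_model",
    top_k=24,                                     # include the top 24 targets in validation metrics
    targets_to_include=["target_a", "target_b"],  # restrict validation metrics to these targets
)
```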