Control of data loading process
data_loader_params blocks in YAML configuration file
Check This First!
This article applies to BaseModel accessed via a Docker container. If you are using BaseModel as a Snowflake (SF) GUI application, please refer to the Snowflake Native App section.
The data_loader_params block allows you to set constructor parameters for the PyTorch DataLoader.
These settings control how the data is loaded, such as the batch size and the number of worker processes.
Parameters |
---|
- batch_size : int
  default: 256
  The batch size: how many samples to load per batch.
- num_workers : int
  default: 0
  How many sub-processes to use for data loading. 0 means that the data will be loaded in the main process. Increasing the number of workers splits queries into smaller pieces, which reduces memory consumption on the database side.
- pin_memory : boolean
  default: False
  If True, the data loader will copy Tensors into device/CUDA pinned memory before returning them.
- drop_last : boolean
  default: False
  Set to True to drop the last incomplete batch if the dataset size is not divisible by the batch size.
- pin_memory_device : str
  default: None
  The device the memory should be pinned to, if pin_memory is True.
- prefetch_factor : int
  default: 2
  Number of batches loaded in advance by each worker.
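
The interaction between batch_size and drop_last can be illustrated with a little arithmetic. The sketch below is not BaseModel code. The helper name batches_per_epoch and the dataset size of 1000 are made up for illustration. It assumes a dataset of known length that is split into consecutive batches, which is how the PyTorch DataLoader reports its length for map-style datasets.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int = 256, drop_last: bool = False) -> int:
    """Return how many batches one pass over the dataset yields.

    Hypothetical helper for illustration only; the defaults mirror the
    documented defaults of batch_size (256) and drop_last (False).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if drop_last:
        # Only full batches are kept; the remainder is discarded.
        return num_samples // batch_size
    # The final, smaller batch is kept as well.
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset of 1000 samples with the default batch_size of 256.
print(batches_per_epoch(1000))                  # 4: three full batches plus one of 232 samples
print(batches_per_epoch(1000, drop_last=True))  # 3: the incomplete batch is dropped
```

With drop_last set to True, every batch has exactly batch_size samples, at the cost of skipping up to batch_size - 1 samples per pass.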
Example |
---|
data_loader_params:
  batch_size: 256
  num_workers: 5
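
To see what the example block implies, the sketch below merges it with the documented defaults and estimates how many batches the workers keep loaded in advance. How BaseModel processes the block internally is not described here, so resolve_params and batches_in_flight are hypothetical helpers, and the block is written as a plain Python dict rather than parsed from YAML. The estimate relies on the PyTorch DataLoader behaviour that each worker prefetches prefetch_factor batches.

```python
# Defaults as listed in the parameter descriptions of this article.
DEFAULTS = {
    "batch_size": 256,
    "num_workers": 0,
    "pin_memory": False,
    "drop_last": False,
    "pin_memory_device": None,
    "prefetch_factor": 2,
}


def resolve_params(user_params: dict) -> dict:
    """Overlay user-supplied values on the defaults (hypothetical helper)."""
    unknown = set(user_params) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown data_loader_params keys: {sorted(unknown)}")
    return {**DEFAULTS, **user_params}


def batches_in_flight(params: dict) -> int:
    """Batches prefetched across all workers; 0 when loading in the main process."""
    return params["num_workers"] * params["prefetch_factor"]


# The example block above, expressed as a dict.
example_block = {"batch_size": 256, "num_workers": 5}
params = resolve_params(example_block)

print(params["prefetch_factor"])                        # 2 (default, not overridden)
print(batches_in_flight(params))                        # 10 batches prefetched in total
print(batches_in_flight(params) * params["batch_size"]) # 2560 samples held in advance
```

Raising num_workers or prefetch_factor therefore increases the amount of data buffered on the client side, which is worth keeping in mind when batches are large.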