Managing space and memory

The memory_constraining_params and query_optimization blocks in the YAML configuration file

⚠️
Check This First!
This article refers to BaseModel accessed via a Docker container. Please refer to the Snowflake Native App section if you are using BaseModel as a Snowflake GUI application.

Controlling the model size

Parameters in the memory_constraining_params block determine the size of the model:

explicitly, by setting the model architecture,
implicitly, by steering the size of the input.

Depending on your infrastructure and data, you may want to reduce the model size or, conversely, increase it.

Parameters

hidden_dim : int
default: 2048
The size of the hidden layers.
num_layers : int
default: NUM_LAYERS_DEFAULT
The number of hidden layers. The NUM_LAYERS_DEFAULT constant sets it to 4.
emde_quality : float
default: 1.0
The quality of the features' density estimation. The lower the quality, the smaller the sketches and, therefore, the smaller the input to the model.

Example

memory_constraining_params:
  hidden_dim: 1024
  emde_quality: 0.8
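
The opposite direction, increasing the model size, could look like the sketch below. It assumes the parameters are nested under memory_constraining_params in the same way the query_optimization example nests its parameters. The values are illustrative only, not recommendations; whether your infrastructure can accommodate them depends on available memory.

```yaml
# Illustrative sketch: a larger model than the defaults (values are assumptions)
memory_constraining_params:
  hidden_dim: 4096    # wider hidden layers than the default of 2048
  num_layers: 6       # more hidden layers than the default of 4
  emde_quality: 1.0   # default quality, so the input size is not reduced
```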

Optimizing query

The parameters in the query_optimization block control how the query is split and the degree of parallelization.

Parameters

num_query_chunks : int, optional
default: 1
The number of segments the query is divided into. Splitting the query into smaller pieces can reduce memory consumption on the database side, which is particularly useful for memory-intensive queries.
num_cpus : int > 0, optional
default: 4
The number of CPUs used at the start of the pretrain phase. Helps to limit CPU utilization.
num_concurrent_features : int > 0, optional
default: 4
This parameter specifies the number of columns processed concurrently at the start of the pretrain phase. Processing fewer columns simultaneously reduces memory consumption but increases computation time.
Note: The maximum number of processes used is num_concurrent_features * num_cpus.

Example

query_optimization:
  num_query_chunks: 4
  num_cpus: 10
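
In the example above, num_concurrent_features is left at its default of 4, so up to 4 * 10 = 40 processes may be used. For a memory-constrained machine, a more conservative configuration might look like the following sketch; the values are illustrative assumptions, not recommendations.

```yaml
# Illustrative sketch: trading computation time for lower memory use
query_optimization:
  num_query_chunks: 8          # split the query into 8 segments to lower memory use on the database side
  num_cpus: 2                  # fewer CPUs at the start of the pretrain phase
  num_concurrent_features: 2   # fewer columns processed at once
  # Maximum number of processes: num_concurrent_features * num_cpus = 2 * 2 = 4
```

Lowering num_concurrent_features reduces memory consumption but increases computation time, so it is worth adjusting one parameter at a time.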
