End-to-End Example Configuration

The complete YAML for foundation model training

⚠️

Check This First!

This article refers to BaseModel accessed via a Docker container. If you are using BaseModel as a Snowflake GUI application, please refer to the Snowflake Native App section.


The example below walks through a complete YAML file, from the definition of data sources, through the configuration of data loading and model training, to control over space and memory:

data_sources:
  - type: main_entity_attribute
    main_entity_column: UserID
    name: customers
    data_location:
      database_type: snowflake
      connection_params:
        user: username
        password: strongpassword123
        account: xy12345.west-europe.azure
        database: EXAMPLE_DB
        schema: EXAMPLE_SCHEMA
      table_name: customers
    disallowed_columns: [CreatedAt]
  - type: event
    main_entity_column: UserID
    name: purchases
    date_column:
      name: Timestamp
    data_location:
      database_type: snowflake
      connection_params:
        user: username
        password: strongpassword123
        account: xy12345.west-europe.azure
        database: EXAMPLE_DB
        schema: EXAMPLE_SCHEMA
      table_name: purchases
    where_condition: "Timestamp >= today() - 365"
    sql_lambda: "TO_DOUBLE(price)"

data_params:
  data_start_date: 2022-06-01 00:00:00
  validation_start_date: 2023-06-01 00:00:00
  test_start_date: 2023-07-01 00:00:00
  check_target_for_next_N_days: 7
  
loading_params:
  Train:
    cache_dir: /data/USER/cache/name
  Validation:
    cache_dir: /data/USER/cache/name
  Test:
    cache_dir: /data/USER/cache/name

data_loader_params:
  batch_size: 256
  num_workers: 5

training_params:
  learning_rate: 0.00005
  epochs: 3
  checkpoint_dir: "my_fm/"
  devices: [1]
  
memory_constraining_params:
  hidden_dim: 4096

query_optimization:
  num_query_chunks: 4
  num_workers: 10
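
The three dates in data_params determine how much history falls into each split. The Python sketch below works through that arithmetic for the values used in this example. It assumes, without confirmation from this article, that training data runs from data_start_date up to validation_start_date and that validation data runs from validation_start_date up to test_start_date. The function name split_windows is hypothetical and is not part of BaseModel.

```python
from datetime import datetime

# Values copied from the data_params block of the example configuration.
DATA_PARAMS = {
    "data_start_date": "2022-06-01 00:00:00",
    "validation_start_date": "2023-06-01 00:00:00",
    "test_start_date": "2023-07-01 00:00:00",
}

FMT = "%Y-%m-%d %H:%M:%S"


def split_windows(params):
    """Return the length in days of the training and validation windows.

    Assumption (not confirmed by the documentation): training data spans
    data_start_date up to validation_start_date, and validation data spans
    validation_start_date up to test_start_date.
    """
    start = datetime.strptime(params["data_start_date"], FMT)
    val = datetime.strptime(params["validation_start_date"], FMT)
    test = datetime.strptime(params["test_start_date"], FMT)
    # The windows only make sense if the three dates are strictly increasing.
    if not start < val < test:
        raise ValueError(
            "expected data_start_date < validation_start_date < test_start_date"
        )
    return {
        "train_days": (val - start).days,
        "validation_days": (test - val).days,
    }


if __name__ == "__main__":
    print(split_windows(DATA_PARAMS))
    # prints {'train_days': 365, 'validation_days': 30}
```

Under that assumption, the example trains on one year of history and validates on the following 30 days.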

With the file complete, you are now ready to run the training as described here.
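
Two formatting slips are easy to introduce when editing a file like this by hand: YAML does not allow tab characters in indentation, and a trailing comma after a plain value (for example `user: username,`) becomes part of the value instead of being ignored. The sketch below is a minimal pre-flight check for both, using only the Python standard library. It is not a BaseModel tool, and the function names are hypothetical.

```python
def has_trailing_comma(line):
    """Return True for a 'key: value,' line whose value is a plain scalar."""
    stripped = line.strip()
    if ":" not in stripped or not stripped.endswith(","):
        return False
    value = stripped.split(":", 1)[1].strip()
    # Skip flow collections and quoted strings, where a comma may be legitimate.
    return not value.startswith(("[", "{", '"', "'"))


def lint_yaml_text(text):
    """Return a list of (line_number, message) for suspicious lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything that lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        if has_trailing_comma(line):
            problems.append((number, "trailing comma in scalar value"))
    return problems


if __name__ == "__main__":
    sample = (
        "data_location:\n"
        "\tdatabase_type: snowflake\n"
        "  connection_params:\n"
        "    user: username,\n"
    )
    for number, message in lint_yaml_text(sample):
        print(f"line {number}: {message}")
```

The check is deliberately line-based and conservative. It does not parse YAML, so it only flags the two patterns described above and does not replace a proper YAML parser.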