Requirements

Hardware

Component Minimum
GPU Multi-GPU cluster of A10s / L40s (recommended: NVIDIA A100 or better); CUDA 12+
RAM 240 GB
CPU 32 cores
Disk 1 TB
Runtime Docker-capable environment
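
As a rough illustration, the CPU, RAM and disk minimums in the table can be turned into a preflight check. This is a minimal sketch, not an official tool: the names `find_shortfalls` and `probe_host` are hypothetical, 1 TB is assumed to mean 1000 GB, total (not free) disk size is assumed to be what counts, and the RAM probe relies on `os.sysconf`, which is only available on Unix-like systems. GPU model, CUDA version and Docker availability are not checked here.

```python
import os
import shutil

# Minimums taken from the Hardware table.
# Assumption: "1 TB" is read as 1000 GB.
MIN_CPU_CORES = 32
MIN_RAM_GB = 240
MIN_DISK_GB = 1000


def find_shortfalls(cpu_cores, ram_gb, disk_gb):
    """Return one message per minimum that is not met (empty list means OK)."""
    shortfalls = []
    if cpu_cores < MIN_CPU_CORES:
        shortfalls.append(f"CPU: {cpu_cores} cores < {MIN_CPU_CORES} cores")
    if ram_gb < MIN_RAM_GB:
        shortfalls.append(f"RAM: {ram_gb:.0f} GB < {MIN_RAM_GB} GB")
    if disk_gb < MIN_DISK_GB:
        shortfalls.append(f"Disk: {disk_gb:.0f} GB < {MIN_DISK_GB} GB")
    return shortfalls


def probe_host(path="/"):
    """Read core count, RAM and total disk size of this host (Unix-like only)."""
    cpu_cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).total
    return cpu_cores, ram_bytes / 1e9, disk_bytes / 1e9
```

Calling `find_shortfalls(*probe_host())` on the target machine would list any unmet minimum. Keeping the comparison separate from the probing makes the thresholds easy to test without access to the real hardware.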

Training and inference throughput scales linearly with the number of GPUs, so doubling the GPU count roughly halves wall-clock time.

Data

Minimum Structure

You need at least one event data source with:

Column Description
Entity ID Unique identifier, e.g. customer_id
Timestamp When the event occurred
Event attributes (min. 1) e.g. product_id, price, category
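
The minimum structure above can be expressed as a per-row check. The sketch below is only an illustration under stated assumptions: the column names `customer_id` and `timestamp` are placeholders for whatever your source uses, timestamps are assumed to be ISO 8601 strings, and `validate_event` is a hypothetical helper, not part of any product API.

```python
from datetime import datetime

# Hypothetical column names; only the three roles are required by the table.
ENTITY_ID = "customer_id"
TIMESTAMP = "timestamp"


def validate_event(row):
    """Return a list of problems with one event row (a dict); empty means OK."""
    problems = []

    # Role 1: entity ID must be present and non-empty.
    if row.get(ENTITY_ID) in (None, ""):
        problems.append("missing entity ID")

    # Role 2: timestamp must be present and parseable (ISO 8601 is assumed).
    ts = row.get(TIMESTAMP)
    if ts is None:
        problems.append("missing timestamp")
    else:
        try:
            datetime.fromisoformat(ts)
        except (TypeError, ValueError):
            problems.append("timestamp is not ISO 8601")

    # Role 3: at least one non-null column besides entity ID and timestamp.
    attributes = [
        key for key, value in row.items()
        if key not in (ENTITY_ID, TIMESTAMP) and value is not None
    ]
    if not attributes:
        problems.append("no event attributes")

    return problems
```

For example, a row with `customer_id`, `timestamp`, `product_id` and `price` passes, while a row that carries only an ID and a timestamp is reported as having no event attributes.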

Adding entity attributes is recommended but not required:

Example table Example columns
Customer attributes customer_id, age, region, signup_date, segment
Item attributes product_id, category, brand, price_tier, product_name
Store attributes store_id, format, region, zip_code, city

Volume Guidelines

Requirement Threshold
Unique main entities ≥ 10 000 (e.g. customers, users)
Event volume ≥ 100 000 interactions per month
History — frequent interactions (banking, telco, FMCG, …) ≥ 3 months
History — infrequent interactions (fashion, insurance, automotive, …) ≥ 1 year
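
To see whether a dataset is in the right range, the thresholds above can be computed directly from the event log. The following is a sketch under assumptions that the guidelines do not spell out: "3 months" is treated as 90 days and "1 year" as 365 days, the monthly volume is taken as the average over calendar months that contain events, and `volume_report` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Thresholds from the Volume Guidelines table.
MIN_ENTITIES = 10_000
MIN_EVENTS_PER_MONTH = 100_000
# Assumption: 3 months ~ 90 days, 1 year ~ 365 days.
MIN_HISTORY_DAYS = {"frequent": 90, "infrequent": 365}


def volume_report(events, interaction_type="frequent"):
    """Summarize an iterable of (entity_id, datetime) pairs against the guidelines."""
    entities = set()
    per_month = {}          # (year, month) -> event count
    first = last = None
    for entity_id, ts in events:
        entities.add(entity_id)
        key = (ts.year, ts.month)
        per_month[key] = per_month.get(key, 0) + 1
        if first is None or ts < first:
            first = ts
        if last is None or ts > last:
            last = ts

    history_days = (last - first).days if first is not None else 0
    avg_per_month = sum(per_month.values()) / len(per_month) if per_month else 0
    return {
        "unique_entities": len(entities),
        "avg_events_per_month": avg_per_month,
        "history_days": history_days,
        "meets_guidelines": (
            len(entities) >= MIN_ENTITIES
            and avg_per_month >= MIN_EVENTS_PER_MONTH
            and history_days >= MIN_HISTORY_DAYS[interaction_type]
        ),
    }


# Tiny illustrative sample: 10 events, 3 entities, all within one month.
start = datetime(2024, 1, 1)
sample = [(f"c-{i % 3}", start + timedelta(days=i)) for i in range(10)]
report = volume_report(sample)
```

In practice the same three aggregates would usually be computed with a query in the warehouse itself; the Python version only makes the definitions explicit.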

Supported Data Sources

  • Snowflake
  • BigQuery
  • Azure Synapse
  • Parquet
  • Databricks
  • Hive
  • ClickHouse

See Data Connect Sources for connection details.

Example Performance Profile

Real-world retail customer benchmark:

Metric Value
Events ~8 billion
Unique clients ~18 million
Unique products ~1 million
Foundation model training 12 h on 1× NVIDIA A100
Scenario fine-tuning (16 000 brands) 10 h on 1× NVIDIA A100
Inference throughput 2 718 clients/sec per GPU

Additional optimizations (low-rank adapter tuning, model quantization) can further reduce cost in exchange for a slight loss of quality.
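
The benchmark figures can be combined with the linear-scaling statement from the Hardware section for back-of-the-envelope planning. The sketch below assumes ideal linear scaling and ignores data loading, startup and communication overhead, so the results are estimates rather than guarantees; the function names are hypothetical.

```python
def inference_hours(n_clients, clients_per_sec_per_gpu, n_gpus=1):
    """Estimated hours to score n_clients, assuming throughput adds up per GPU."""
    seconds = n_clients / (clients_per_sec_per_gpu * n_gpus)
    return seconds / 3600


def scaled_hours(single_gpu_hours, n_gpus):
    """Estimated wall-clock hours on n_gpus, assuming ideal linear scaling."""
    return single_gpu_hours / n_gpus


# Figures from the benchmark table: ~18 million clients, 2 718 clients/sec per GPU.
one_gpu = inference_hours(18_000_000, 2718)              # about 1.84 hours
four_gpus = inference_hours(18_000_000, 2718, n_gpus=4)  # about 0.46 hours

# Foundation model training: 12 h on one A100, so 3 h on four under the same assumption.
training_on_four = scaled_hours(12, 4)
```

Scoring the full benchmark population therefore takes a little under two hours on a single GPU, which is small compared with the 12 h training run.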