Frequently asked questions
Everything you wanted to know about BaseModel — in one place
About basemodel.ai
What is BaseModel.ai?
BaseModel is a framework for constructing private foundation models and downstream scenario-specific models trained on your organization’s behavioral data. It automatically learns universal behavioral representations from event streams, then fine-tunes supervised models for specific prediction goals.
See the About BaseModel section for more details.
What do I need to start working with BaseModel.ai?
To start working with BaseModel, you need three things:
- Behavioral data that meets the minimum quality and volume requirements (see Data requirements below),
- Infrastructure where a Docker container can be deployed with GPU access meeting BaseModel’s hardware specifications, and
- A data scientist or ML engineer able to configure data sources, define prediction objectives, and operate the training process. You may also need additional roles or skills during the initial deployment, along with support from business stakeholders to define prediction objectives and evaluate results effectively.
See the Requirements section to learn more.
What skills or roles are necessary to deploy and run BaseModel?
You need at least one Data Scientist to work with BaseModel. Depending on the structure and responsibilities in your organization, they may need the support of:
- DevOps/Infra Engineer – deploy Docker container to a GPU-equipped VM or Kubernetes cluster,
- Data Engineer – grant read-only warehouse access and configure data sources,
- Analyst/Product Owner – consult on business objectives and prediction targets definition.
See the Requirements, Data Model and Sources, and Defining the Task and Target of the Model sections for more details on deployment, data sources, and model targets respectively.
Can BaseModel be used without event data?
No. BaseModel requires timestamped event streams linked to identifiable entities. If only aggregate, static data is available, simpler models like GBDTs are sufficient.
See the Requirements section to learn more.
How difficult is deployment?
BaseModel ships as a self-contained Docker container for on-prem or cloud environments. If you already use Docker or Kubernetes, setup is straightforward.
See the Requirements section to learn more.
Does BaseModel send or copy my data anywhere?
No. All computation happens inside your infrastructure, reading directly from your data warehouse.
Data & connectivity
Which types of events are most often used?
BaseModel works with any event-based behavioral data that reflects interactions between entities and their environment. Typical examples include:
- Retail: product views, cart additions, transactions, returns, loyalty interactions, campaign responses.
- Banking: account transactions, card transactions, credit applications, investments, customer support interactions.
- Telecommunications: calls, messages, data usage, service activations, plan changes, churn-related events.
Any recurring, timestamped event tied to an entity ID (customer, device, account, etc.) can serve as an input.
See the Data Model and Sources section for more details.
How much data do I need, and how long a history?
You need at least:
- one event data source (entity id, timestamp, attributes),
- 100k+ behavioral interactions per month, and
- 10k+ distinct behavioral profiles (customers, members, devices, etc.).
For high-frequency domains (banking, telco), 3 months of data may suffice; for low-frequency ones (fashion, insurance), aim for at least one year.
See the Requirements section for more information.
What data formats and schema are supported?
The requirements for BaseModel to consume the data are:
- All event data must include at least timestamp and entity_id fields.
- Attribute tables must be joinable on the same entity ID as in the event tables.
- Both event and attribute data must be accessible as tables or views in supported database engines (e.g., BigQuery, Snowflake, Redshift, PostgreSQL, MSSQL) or stored as Parquet files.
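For illustration, a minimal conforming event table could be prepared and stored as Parquet like this (a sketch; every column beyond timestamp and entity_id is an assumption):

```python
# A sketch of a minimal event table; only timestamp and entity_id are
# required, all other columns here are illustrative attributes.
import pandas as pd

events = pd.DataFrame({
    "entity_id": ["c_001", "c_001", "c_002"],
    "timestamp": pd.to_datetime(
        ["2024-05-01 10:15:00", "2024-05-03 18:02:00", "2024-05-02 09:40:00"],
        utc=True,
    ),
    "event_type": ["product_view", "purchase", "product_view"],
    "amount": [None, 59.99, None],
})

# Stored as Parquet, the table can be used directly as a BaseModel source.
events.to_parquet("events.parquet", index=False)
```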
See the Data Model and Sources section for more details.
Can I add aggregated or static features?
Yes, in the following way:
- Temporally aggregated features (e.g., weekly purchase counts, average spend per month) should be treated as timestamped events.
- Static features (e.g., demographics, product attributes, metadata) should be included as entity attributes.
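A minimal sketch of both patterns, assuming illustrative column names (weekly_spend, age_band, home_region):

```python
# Sketch: a weekly aggregate becomes timestamped events; demographics stay
# in a separate attribute table joinable on entity_id. Names are illustrative.
import pandas as pd

raw = pd.DataFrame({
    "entity_id": ["c_001"] * 4,
    "timestamp": pd.to_datetime(
        ["2024-05-01", "2024-05-02", "2024-05-08", "2024-05-09"], utc=True),
    "amount": [10.0, 5.0, 7.5, 2.5],
})

# Temporally aggregated feature -> one "event" row per (entity, week).
weekly_spend = (
    raw.set_index("timestamp")
       .groupby("entity_id")["amount"]
       .resample("W")
       .sum()
       .reset_index()
       .rename(columns={"amount": "weekly_spend"})
)

# Static features -> an entity attribute table.
attributes = pd.DataFrame({
    "entity_id": ["c_001"],
    "age_band": ["25-34"],
    "home_region": ["EU"],
})
```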
See the Data Model and Sources section for more details.
How can I improve data quality?
Good data quality ensures smooth ingestion and reliable training. Make sure to:
- Ensure consistent timestamps and entity IDs.
- Remove duplicate columns (those mapping 1:1 with another field).
- Eliminate invalid or redundant categorical values.
- Enrich events with relevant metadata or contextual features.
- Review the log of the foundation model’s fit stage for any warnings or suggestions.
How are missing values handled?
BaseModel does not impute missing values automatically. Columns with missing data exceeding a defined threshold are ignored during training. Review the log of the foundation model’s fit stage for any warnings or suggestions; you can then prefill or drop the affected columns depending on your use case.
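As a sketch of a possible pre-processing pass (the 50% threshold and fill strategies are assumptions, not BaseModel defaults):

```python
# Sketch: inspect missingness and prefill or drop columns before training.
# The 0.5 threshold and the fill strategies are assumptions, not defaults.
import pandas as pd

events = pd.read_parquet("events.parquet")

for column, ratio in events.isna().mean().items():
    if ratio > 0.5:                                   # mostly empty: drop
        events = events.drop(columns=column)
    elif ratio > 0.0:                                 # sparse gaps: prefill
        if pd.api.types.is_numeric_dtype(events[column]):
            events[column] = events[column].fillna(events[column].median())
        else:
            events[column] = events[column].fillna("unknown")
```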
How are timezones processed?
BaseModel handles timestamps in the following way:
- All dates are converted to UTC during preprocessing.
- If a timezone is specified, BaseModel translates timestamps to UTC.
- If no timezone is provided, timestamps are interpreted as UTC.
- If the database engine does not support timezones, BaseModel drops tzinfo and preserves wall-time.
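The same rules can be illustrated with pandas (a sketch; BaseModel applies the equivalent normalization internally during preprocessing):

```python
# Sketch of the timestamp normalization rules described above.
import pandas as pd

aware = pd.to_datetime(pd.Series(["2024-05-01 10:00:00+02:00"]))
utc = aware.dt.tz_convert("UTC")            # timezone given: translate to UTC

naive = pd.to_datetime(pd.Series(["2024-05-01 10:00:00"]))
as_utc = naive.dt.tz_localize("UTC")        # no timezone: interpret as UTC

# Engine without timezone support: drop tzinfo but keep wall-clock time.
wall_time = utc.dt.tz_localize(None)
```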
How do I prevent data leakage?
Training, validation, and test data are separated strictly by design to ensure temporal integrity. You must split data chronologically, keeping all inference and test examples after the validation end date. If this rule is violated, BaseModel raises an explicit error.
See Training, validation, and testing sets article for more details.
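A minimal sketch of such a chronological split, assuming illustrative cut-off dates:

```python
# Sketch of a leakage-safe chronological split; the cut-off dates are
# illustrative assumptions.
import pandas as pd

events = pd.read_parquet("events.parquet")

train_end      = pd.Timestamp("2024-01-01", tz="UTC")
validation_end = pd.Timestamp("2024-02-01", tz="UTC")

train      = events[events["timestamp"] < train_end]
validation = events[(events["timestamp"] >= train_end)
                    & (events["timestamp"] < validation_end)]
# All test and inference examples must fall after the validation end date,
# otherwise BaseModel raises an explicit error.
test       = events[events["timestamp"] >= validation_end]
```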
Does BaseModel handle text or image data?
Yes. You can specify the type of each column (e.g., text, time-series). When set to text or image, BaseModel automatically produces embeddings using built-in encoders. External pretrained embeddings are also supported.
Training & optimization
How do I troubleshoot target function problems?
It’s good practice to run verify_target() on your new target function to make sure there are no runtime or data type errors and that the percentage of None values remains under an acceptable threshold. If training fails or performance metrics behave unexpectedly, review label integrity, examine the distribution of labels or regression targets to detect imbalance or outliers, and inspect logs for warnings.
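A hypothetical sketch of a target function and the checks verify_target() automates (the entity-history accessor shown here is an assumption, not the documented API):

```python
# Hypothetical sketch; the entity-history accessor is an assumption,
# not the documented BaseModel API.

def churn_target(entity_history):
    """Illustrative label: 1 if the entity has no future events, else 0.
    Returns None when no label can be computed for this entity."""
    future_events = getattr(entity_history, "future_events", None)  # assumed
    if future_events is None:
        return None                     # counted toward the None percentage
    return int(len(future_events) == 0)

# verify_target(churn_target) would run the function on sample entities and
# report runtime errors, type mismatches, and the share of None labels.
```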
See the Validating the Target Function guide for more information.
How is model performance evaluated?
Model evaluation is performed using the test() method, which returns predictions and ground truth in the desired format, along with task-specific default metrics (e.g., AUC, accuracy, RMSE). You can add additional metrics from TorchMetrics repertoire or your own custom metric functions.
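For example, extra metrics could be assembled like this (the TorchMetrics classes are real; the test(metrics=...) call shape is an assumption):

```python
# Sketch: extra metrics for evaluation. The TorchMetrics classes are real;
# the test(metrics=...) call shape below is an assumption.
import torch
from torchmetrics.classification import BinaryAUROC, BinaryF1Score

def positive_rate(preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Custom metric: share of predicted positives at a 0.5 threshold."""
    return (preds > 0.5).float().mean()

extra_metrics = {
    "auroc": BinaryAUROC(),
    "f1": BinaryF1Score(),
    "pos_rate": positive_rate,
}
# results = model.test(metrics=extra_metrics)   # hypothetical call shape
```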
See Testing scenario model article to learn more.
How do I decide when to stop training?
Training can either continue for a fixed number of epochs or stop automatically using configurable early stopping criteria. It’s recommended to track metrics using a logger such as Neptune or an equivalent tool to monitor loss trends and convergence, helping you decide when to stop training for best generalization.
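For intuition, patience-based early stopping boils down to the following generic loop (a standalone illustration; BaseModel’s own criteria are set through its configuration rather than hand-coded):

```python
# Generic patience-based early stopping, for intuition only.
import random

def run_validation_epoch() -> float:
    return random.random()              # stand-in for a real validation pass

max_epochs, patience = 100, 5
best_loss, bad_epochs = float("inf"), 0

for epoch in range(max_epochs):
    val_loss = run_validation_epoch()
    if val_loss < best_loss:
        best_loss, bad_epochs = val_loss, 0   # improvement: reset the counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break                             # plateau: stop training early
```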
See Model training configuration guide for details.
Can BaseModel handle class imbalance?
BaseModel does not rebalance data automatically. However, it is designed to learn effectively from large and naturally imbalanced datasets, as its representation learning captures broad behavioral patterns across entities and interactions. When needed, users can apply data sampling or loss weighting techniques to specific training jobs and compare performance improvements using validation metrics.
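For instance, loss weighting for a binary task with roughly 5% positives could look like this in plain PyTorch (a sketch; the class ratio is an assumption):

```python
# Sketch of loss weighting for an imbalanced binary task.
import torch

n_negative, n_positive = 95_000, 5_000
pos_weight = torch.tensor([n_negative / n_positive])    # = 19.0

# Up-weights positive examples inside the loss instead of resampling data.
loss_fn = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(8, 1)
labels = torch.randint(0, 2, (8, 1)).float()
loss = loss_fn(logits, labels)
```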
How can I make training faster?
To optimize performance, follow these steps in the recommended order:
- Optimize data – Use interpret() to identify which columns contribute most to the target, remove duplicates, and experiment with shorter time windows to speed up iterations.
- Parallelize – Scale across multiple GPUs, tune the number of workers, and enable concurrent feature processing for efficient resource use.
- Enable caching – Reuse preprocessed batches between runs to reduce I/O overhead.
- As a last resort, reduce graph quality or neural network size to shorten training time, but be mindful that this can lead to performance degradation.
Consult relevant guides in Configuring Parameters section for more specific information.
Can I perform hyperparameter tuning?
There is no standalone tuning interface in the configuration. BaseModel’s optimizer adjusts parameters automatically for stable convergence. However, you can perform systematic experiments and analyze results using your preferred logger (e.g., MLFlow, Neptune) to explore optimal hyperparameter settings.
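A minimal sketch of such an experiment loop tracked with MLflow (the parameter names are illustrative, not BaseModel configuration keys):

```python
# Sketch of a systematic sweep tracked with MLflow.
import mlflow

for learning_rate in (1e-4, 3e-4, 1e-3):
    with mlflow.start_run():
        mlflow.log_param("learning_rate", learning_rate)
        # ... train a scenario model with this setting ...
        validation_auc = 0.80                 # placeholder for a real result
        mlflow.log_metric("validation_auc", validation_auc)
```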
Can BaseModel forecast further into the future (beyond today)?
Yes — but the target function must define a prediction interval relative to the chosen prediction date. For example, you can predict churn between 30 and 90 days from the prediction date, or a purchase within 14 to 28 days. The prediction date itself must not be set in the future, i.e., there must be no gap between when inference is executed and the last available data in your event streams.
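A hypothetical sketch of such an interval-based target (the event helpers below are assumptions, not the documented API):

```python
# Hypothetical sketch: churn defined as no purchase between 30 and 90 days
# after the prediction date. Event helpers are assumptions.
from datetime import timedelta

def churn_30_90(events, prediction_date):
    window_start = prediction_date + timedelta(days=30)
    window_end = prediction_date + timedelta(days=90)
    purchases = [
        e for e in events                     # assumed iterable of events
        if e.type == "purchase" and window_start <= e.timestamp < window_end
    ]
    return int(len(purchases) == 0)           # 1 = churned within the window
```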
Science & architecture
Why is BaseModel different from GBDT-based approaches?
GBDT models (XGBoost, CatBoost, LightGBM, etc.) rely on static, hand-engineered features and struggle with temporal, multi-relational, or multi-modal data. BaseModel directly models event graphs and temporal dependencies, generating dynamic embeddings that evolve with behavior. It automatically handles feature creation, aggregation, and encoding, reducing maintenance and improving generalization.
See Temporal data splits to learn more.
Is BaseModel a transformer or a generative model?
BaseModel follows a different and more general architecture than transformers. Instead of predicting the next token, it models probability distributions over future interactions across multiple behavioral domains — for example, how past web activity might influence future actions in a call center. It learns from temporally ordered streams of events enriched with entity attributes and metadata, automatically discovering the relevant “tokens” or interaction types during training. While BaseModel is generative in the mathematical sense (it estimates probability distributions), it does not generate text, images, or content — it produces predictions aligned with classical ML tasks such as classification, recommendation, and regression.
See Synerise AI Research article for more information.
Is BaseModel a feature store or a vector database?
No. A feature store holds precomputed aggregates; a vector DB indexes existing embeddings. BaseModel computes features dynamically from raw events, producing fresh behavioral representations on every inference.
How does BaseModel process features and learn what matters?
BaseModel builds representations of features from multiple modalities — behavioral interactions, time series, text, image, and categorical attributes — and exposes them to a deep neural network during training. Through weight adjustment, the model learns which inputs contribute to its objective, but technically no features are discarded. Their relevance is surfaced through the interpretability module, which shows which data sources and columns had the greatest influence. Proprietary technologies such as Cleora and emde are used within the pipeline to build robust, universal behavioral representations that generalize across tasks.
See Synerise AI Research article to learn more.
What is BaseModel’s equivalent of LLM “tokens”? What does it predict?
BaseModel follows a more general architecture than transformers. It does not predict the “next token,” but rather a Cartesian-product probability distribution across multiple disjoint subspaces (modalities) of interaction data.
- Each subspace represents a behavioral domain — for example, web, call center, or product usage — enriched with contextual metadata.
- The model learns mutual information flow between these subspaces, so activity in one domain can inform predictions in another.
- During self-supervised training, BaseModel learns to predict future interactions in all available domains given the historical sequence (the arrow of time is reversed). These “tokens” — events or interactions — become building blocks for downstream scenario models, whose target functions represent outcomes such as churn (“no further interactions”) or purchase (“specific interaction occurs”).
See Synerise AI Research article for more information.
Can BaseModel provide an advantage with sparse or heterogeneous data?
Yes. BaseModel is explicitly designed for heterogeneous and sparse interaction data across multiple channels.
- Scalability: tens of millions of interactions per month are typical and easily handled.
- Architecture: it consumes temporally ordered streams of interactions from all connected systems (web, mobile, contact center, POS, etc.).
- Cross-domain learning: sparse domains can benefit from denser ones through shared latent representations. The key prerequisite is the ability to link streams using a consistent entity identifier (e.g., client_id) and a timestamp to synchronize behaviors.
Can I visualize graphs or embeddings produced by BaseModel?
No. BaseModel does not currently provide visualizations of learned embeddings. Insights about feature importance and relationships are instead available through the interpretability and attribution modules.
See Interpreting your model's predictions guide to learn more.
Is BaseModel deterministic?
BaseModel can produce either deterministic values or probability distributions, depending on the target function. For example, recommendation outputs are probability-based but can be made deterministic in practice through ranking (argsort). However, not all operations must be deterministic — the underlying logits can also be leveraged for reinforcement learning (RL) applications planned for future development.
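For example, turning probability scores into a deterministic ranking is a one-liner with NumPy’s stable argsort:

```python
# Probability scores turned into a deterministic ranking via argsort.
import numpy as np

scores = np.array([0.12, 0.87, 0.55, 0.87])     # per-item probabilities
ranking = np.argsort(-scores, kind="stable")    # highest score first
print(ranking)                                  # -> [1 3 2 0]
```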
Deployment & integrations
What does a BaseModel deployment look like in practice?
A typical deployment looks as follows:
- Environment setup – Deploy a single Docker container in a GPU-enabled virtual environment (cloud or on-premises) that meets the minimum hardware requirements.
- Data access configuration – Connect the environment to your enterprise data warehouse or provide Parquet files containing event and attribute data.
- Foundation model configuration – Specify event tables, attribute tables, and temporal aggregates in the BaseModel configuration file.
- Foundation model training – Initiate the process to begin the automated ingestion, validation, and foundation model training sequence.
- Initial scenarios training – Define downstream models’ target functions and initiate training, validation, and evaluation; this can require a few experimental iterations to achieve the best model performance.
For most proofs of concept, full deployment and the first training cycle are completed within 2 weeks, assuming infrastructure and access are ready.
How long does it take to get BaseModel up and running?
If infrastructure and credentials are in place, BaseModel can be operational within a few hours. Most teams achieve a working deployment, with data connected and pretraining started, on day one. The first fine-tuned downstream model (e.g., churn prediction or product recommendation) can typically be ready within a week, depending on data volume, scenario complexity, and number of iterative experiments.
How does BaseModel integrate with external ML and data platforms?
BaseModel offers extensive connectivity across the data layer, with built-in support for all major storage and processing platforms, including Snowflake, Databricks, BigQuery, ClickHouse, and Hive, among others. It also integrates seamlessly with standard logging and monitoring systems, ensuring full observability across data ingestion and training workflows.
At the ML orchestration level, BaseModel already includes solid integrations with MLflow for experiment tracking and Snowflake for model deployment and scoring. Integration with Databricks is in progress, while broader automation for managed ML environments remains on the roadmap. These future extensions will further streamline BaseModel’s interoperability with enterprise AI platforms.
What artifacts does BaseModel generate?
BaseModel produces PyTorch feature embeddings (for its own internal use), model checkpoints, logs, predictions, and stored configuration files. Predictions are generated in batch mode, while real-time serving is handled externally by the Synerise Experience Platform.
See Working with checkpoints section for more details.
How scalable is BaseModel? Can I split CPU and GPU workloads?
It’s efficient on a single GPU but scales horizontally with multi-GPU and multi-node training setups. Preprocessing can run in CPU-only environments, while model training and inference require GPUs.
How often should I retrain models?
The optimal retraining frequency depends on the dynamics of your industry and data refresh rate. In most cases, retraining every four weeks is recommended to account for seasonal patterns, the introduction of new entities (such as products or stores), and shifts in behavioral distributions over time.
Monitoring, compliance & explainability
How is security, data privacy and governance handled?
All processing takes place entirely within your infrastructure; BaseModel does not transfer or copy any client data externally. BaseModel itself does not provide or enforce data governance: it serves as a framework for modeling, training, and interpretation, and it is the client’s responsibility to ensure that appropriate data governance and compliance processes are implemented where required.
Does BaseModel combine data across clients?
No. BaseModel is deployed for your private use, and models are trained exclusively on your private data. BaseModel does not perform federated learning or cross-company pretraining.
What does BaseModel monitor in production?
During training, users can configure BaseModel to log any available metrics and status information using TorchMetrics, PyTorch callbacks, and standard experiment loggers such as MLflow or Neptune. This allows full flexibility in monitoring model behavior and integrating with existing MLOps pipelines. It does not provide continuous monitoring services itself — ongoing tracking of data drift, concept drift, or long-term model performance must be handled through your own orchestration or scheduling system. All logs, metrics, and monitoring data remain entirely within your infrastructure and are never accessible to Synerise. Only the client can view, store, or analyze this information.
How interpretable are BaseModel results?
BaseModel provides event-level attribution, identifying which individual interactions contributed most to a prediction. This granularity allows analysis well beyond typical aggregate-level feature importance and enables direct mapping of behavioral drivers.
Can I explain a prediction made months ago?
Yes — as long as the model checkpoint and corresponding data are still available. BaseModel supports retrospective interpretation, but preserving checkpoints and data snapshots for audit and re-evaluation must be handled by the client within their own infrastructure and governance process.
What happens when a new vulnerability is found in BaseModel?
As the provider of BaseModel, Synerise regularly scans the entire codebase for vulnerabilities. Once a vulnerability is reported, we release a new version that addresses it within the SLA agreed with the client in the formal service agreement.