Key Concepts
Understanding these concepts will help you work effectively with BaseModel.
Data
Entities
Entities are the objects represented in your data — customers, users, accounts, products, merchants, content items, and so on. BaseModel distinguishes between:
- Main entity — the primary prediction target, typically a customer, user, or account. Each main entity has a unique identifier and a history of events.
- Additional entities — objects that appear within events and enrich the behavioral context, such as products, merchants, or content items.
Events
Events are timestamped records of entity behavior, such as transactions, clicks, logins, support calls, and subscription changes. They are the core input to the foundation model. Multiple event sources (e.g. purchases and page views) can be combined in a single model.
Attributes
Attributes are static or slowly changing properties of entities, such as demographics, account type, region, or registration date. They provide context that complements the behavioral signal from events.
Data Model
BaseModel expects two types of input tables:
- Events capture what entities do over time.
- Attributes capture what entities are.

You can connect multiple event tables (e.g. purchases, clicks, support tickets) and multiple attribute tables. BaseModel joins them automatically by entity ID. At least one event table is required; attribute tables are optional.
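
To make the two table types concrete, here is a minimal sketch in plain Python. It is not BaseModel code: the table contents, the column names (`customer_id`, `timestamp`, and so on), and the helper `build_entity_view` are all hypothetical, and the grouping only imitates the idea of joining tables by entity ID.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event tables: one row per timestamped action.
purchases = [
    {"customer_id": "c1", "timestamp": datetime(2024, 1, 5), "product_id": "p9", "amount": 20.0},
    {"customer_id": "c2", "timestamp": datetime(2024, 1, 7), "product_id": "p3", "amount": 5.5},
]
page_views = [
    {"customer_id": "c1", "timestamp": datetime(2024, 1, 4), "url": "/shoes"},
]

# Hypothetical attribute table: one row per entity.
customers = [
    {"customer_id": "c1", "region": "EU", "account_type": "premium"},
    {"customer_id": "c2", "region": "US", "account_type": "basic"},
]


def build_entity_view(event_tables, attribute_tables, id_column):
    """Group rows from every table by entity ID (illustrative helper, not a BaseModel API)."""
    view = defaultdict(lambda: {"events": [], "attributes": {}})
    for source, rows in event_tables.items():
        for row in rows:
            # Tag each event with the table it came from.
            view[row[id_column]]["events"].append({"source": source, **row})
    for rows in attribute_tables.values():
        for row in rows:
            view[row[id_column]]["attributes"].update(row)
    for entity in view.values():
        # Keep each entity's events in chronological order.
        entity["events"].sort(key=lambda e: e["timestamp"])
    return dict(view)


entities = build_entity_view(
    {"purchases": purchases, "page_views": page_views},
    {"customers": customers},
    id_column="customer_id",
)
```

In this sketch each main entity ends up with one chronological event stream drawn from both event tables, plus a single dictionary of attributes.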
Foundation Model
Behavioral Representation Fitting
Before training begins, BaseModel examines your data and prepares it for the foundation model. This stage runs automatically:
- Data inspection — Reviews column types and value distributions, applies heuristics to select the best encoding for each field
- Behavioral embeddings — Uses Cleora to learn how entities relate through their interactions, and EMDE to build compact representations of each entity's behavioral density
- Additional modalities — Encodes text, images, and other non-behavioral fields using open-source models
- Feature assembly — Combines all representations into a unified feature set ready for training
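
As a rough illustration of what a data-inspection heuristic might look like, the sketch below picks an encoding label for a column from its values. The function name `suggest_encoding`, the labels it returns, and the `max_categories` threshold are all assumptions made for this example; BaseModel's actual heuristics are not described here and may work quite differently.

```python
def suggest_encoding(values, max_categories=20):
    """Suggest an encoding label for a column (hypothetical heuristic, not BaseModel's)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "skip"  # nothing to learn from an empty column
    # bool is a subclass of int in Python, so exclude it from the numeric check.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return "numeric"
    if len(set(non_null)) <= max_categories:
        return "categorical"
    return "high_cardinality"


column_types = {
    "amount": suggest_encoding([20.0, 5.5, None]),
    "region": suggest_encoding(["EU", "US", "EU"]),
    "product_id": suggest_encoding([f"p{i}" for i in range(100)]),
}
```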
Foundation Model Training
The foundation model learns from all your event data using self-supervised learning — no labels needed. It discovers temporal rhythms, sequential patterns, cross-event correlations, and behavioral segments purely from the structure of the data.
Think of it as teaching the model to understand how entities behave — what patterns are typical, what signals precede change, how different actions relate — without telling it what specific question to answer. This general understanding then transfers to any downstream prediction task.
Temporal Data Splits
To create training examples, BaseModel uses timestamps as split points that divide each entity's event stream into two segments:
- History — Events before the split point, used to compute input features
- Future — Events after the split point, used to derive the prediction target
During foundation model training, split points are generated between every pair of events for each entity, and a random subset is sampled each epoch. This means the model sees the same entity from many different points in time, building robust representations that generalize across prediction windows.
During inference, the model sees only the history — the future is what it predicts.
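
The splitting idea can be sketched in a few lines of Python. This is an illustration only, not BaseModel's implementation. It assumes that "between every pair of events" means consecutive events, places each candidate split at the midpoint between them, and uses hypothetical helper names throughout.

```python
import random
from datetime import datetime


def candidate_split_points(timestamps):
    """One candidate split between each pair of consecutive, distinct timestamps (midpoint is an assumption)."""
    ordered = sorted(timestamps)
    return [a + (b - a) / 2 for a, b in zip(ordered, ordered[1:]) if a < b]


def split_events(events, split_point):
    """Divide an event stream into history (before the split) and future (at or after it)."""
    history = [e for e in events if e["timestamp"] < split_point]
    future = [e for e in events if e["timestamp"] >= split_point]
    return history, future


def sample_training_examples(events, k, rng):
    """Sample up to k split points and return the resulting (history, future) pairs."""
    points = candidate_split_points([e["timestamp"] for e in events])
    chosen = rng.sample(points, min(k, len(points)))
    return [split_events(events, p) for p in chosen]


events = [
    {"timestamp": datetime(2024, 1, 1), "event_type": "login"},
    {"timestamp": datetime(2024, 1, 3), "event_type": "purchase"},
    {"timestamp": datetime(2024, 1, 6), "event_type": "login"},
    {"timestamp": datetime(2024, 1, 10), "event_type": "purchase"},
]

# A fresh sample would be drawn each epoch; the seed only makes this sketch reproducible.
examples = sample_training_examples(events, k=2, rng=random.Random(0))
```

Each sampled pair shows the same entity from a different point in time: a shorter or longer history, with the remaining events forming the future.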
Scenario Model
Fine-Tuning
A scenario model is a task-specific model built on top of the foundation. Rather than learning behavioral patterns from scratch, it inherits the foundation's understanding and adapts it to a concrete business question — churn prediction, spend forecasting, product recommendation, etc.
Because the heavy lifting is already done by the foundation, scenario models train quickly and need far less data to perform well. You can train many scenario models from a single foundation.
Target Function
The target function is a Python function you write that tells BaseModel what to predict. It receives each entity's history and future, as divided by a split point, and returns a label: for example, whether a customer churns, how much they spend, or which products they buy. This is the most important customization point, because the target function is how you encode your business logic into the model. See Target Function for the full API.
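
The sketch below shows the general shape such a function might take. It is not the BaseModel API: the `(history, future)` signature over lists of dictionaries, the `event_type` and `amount` fields, and the function names are assumptions made for illustration. The Target Function reference remains the authority on the real signature and return types.

```python
def churn_target(history, future):
    """Hypothetical binary target: 1 if the entity makes no purchase in the future window, else 0."""
    purchased = any(e["event_type"] == "purchase" for e in future)
    return 0 if purchased else 1


def spend_target(history, future):
    """Hypothetical regression target: total purchase amount in the future window."""
    return sum(e.get("amount", 0.0) for e in future if e["event_type"] == "purchase")


# Toy data standing in for one entity's split event stream.
history = [{"event_type": "purchase", "amount": 12.0}]
active_future = [{"event_type": "purchase", "amount": 30.0}, {"event_type": "login"}]
quiet_future = [{"event_type": "login"}]

labels = {
    "active_churn": churn_target(history, active_future),
    "quiet_churn": churn_target(history, quiet_future),
    "active_spend": spend_target(history, active_future),
}
```

Neither function reads `history` here; in practice it could be used to filter entities, for example to label churn only for customers who purchased before the split.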
Task Types
BaseModel supports five task types:
| Task Type | What You Predict |
|---|---|
| Binary Classification | Yes / no outcome |
| Multiclass Classification | One category from a set |
| Multilabel Classification | Multiple categories at once |
| Regression | Numeric value(s) |
| Recommendation | Ranked list of items |
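
To show how the five task types differ in what is predicted, the sketch below derives one label of each kind from the same toy future window. The field names, the category list, and the label encodings (class index, multi-hot vector, ranked list of IDs) are assumptions for illustration; the formats BaseModel actually expects are defined by its target function API.

```python
from collections import Counter

# Hypothetical future events for a single entity.
future = [
    {"category": "shoes", "product_id": "p1", "amount": 60.0},
    {"category": "shoes", "product_id": "p2", "amount": 40.0},
    {"category": "bags", "product_id": "p1", "amount": 25.0},
]
CATEGORIES = ["bags", "hats", "shoes"]  # assumed fixed label vocabulary

category_counts = Counter(e["category"] for e in future)
product_counts = Counter(e["product_id"] for e in future)

# Binary classification: did the entity buy anything?
binary_label = int(len(future) > 0)

# Multiclass classification: index of the single most frequent category.
multiclass_label = CATEGORIES.index(category_counts.most_common(1)[0][0])

# Multilabel classification: one flag per category that appears at all.
multilabel_label = [int(c in category_counts) for c in CATEGORIES]

# Regression: total spend.
regression_label = sum(e["amount"] for e in future)

# Recommendation: products ranked by purchase count, most frequent first.
recommendation_label = [p for p, _ in product_counts.most_common()]
```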