Key Concepts

Understanding these concepts will help you work effectively with BaseModel.

Data

Entities

Entities are the objects represented in your data — customers, users, accounts, products, merchants, content items, and so on. BaseModel distinguishes between:

Main entity — the primary prediction target, typically a customer, user, or account. Each main entity has a unique identifier and a history of events.
Additional entities — objects that appear within events and enrich the behavioral context, such as products, merchants, or content items.

Events

Events are timestamped records of entity behavior: transactions, clicks, logins, support calls, subscription changes, or any other action that occurs at a known time. Events are the core input to the foundation model. Multiple event sources (e.g. purchases and page views) can be combined in a single model.

Attributes

Attributes are static or slowly changing properties of entities, such as demographics, account type, region, or registration date. They provide context that complements the behavioral signal from events.

Data Model

BaseModel expects two types of input tables:

Event tables capture what entities do over time.
Attribute tables capture what entities are.
You can connect multiple event tables (e.g. purchases, clicks, support tickets) and multiple attribute tables. BaseModel joins them automatically by entity ID.
At least one event table is required. Attribute tables are optional.
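
To make the two table types concrete, here is a minimal sketch in plain Python. It is not BaseModel's API: the table contents, the column names (`customer_id`, `timestamp`) and the `join_by_entity` helper are all hypothetical, and the rule that an attribute row without any events does not create an entity is an assumption made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event tables: every row has an entity ID and a timestamp.
purchases = [
    {"customer_id": "c1", "timestamp": datetime(2024, 1, 5), "product_id": "p9"},
    {"customer_id": "c2", "timestamp": datetime(2024, 1, 7), "product_id": "p3"},
]
page_views = [
    {"customer_id": "c1", "timestamp": datetime(2024, 1, 4), "url": "/shoes"},
]

# Hypothetical attribute table: one row per entity, no timestamp.
customers = [
    {"customer_id": "c1", "region": "EU", "account_type": "premium"},
    {"customer_id": "c3", "region": "US", "account_type": "basic"},
]


def join_by_entity(event_tables, attribute_tables, id_column="customer_id"):
    """Group events and attributes per entity ID, with events sorted by time."""
    entities = defaultdict(lambda: {"events": [], "attributes": {}})
    for source, rows in event_tables.items():
        for row in rows:
            entities[row[id_column]]["events"].append({"source": source, **row})
    for rows in attribute_tables.values():
        for row in rows:
            # Assumption: attributes attach only to entities that have events.
            if row[id_column] in entities:
                extra = {k: v for k, v in row.items() if k != id_column}
                entities[row[id_column]]["attributes"].update(extra)
    for record in entities.values():
        record["events"].sort(key=lambda event: event["timestamp"])
    return dict(entities)


joined = join_by_entity(
    {"purchases": purchases, "page_views": page_views},
    {"customers": customers},
)
```

In this sketch `joined["c1"]` holds a page view followed by a purchase plus the customer's region and account type, while `c2` has events but empty attributes, which mirrors the rule that attribute tables are optional.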

Foundation Model

Behavioral Representation Fitting

Before training begins, BaseModel examines your data and prepares it for the foundation model. This stage runs automatically:

Data inspection — Reviews column types and value distributions, applies heuristics to select the best encoding for each field
Behavioral embeddings — Uses Cleora to learn how entities relate through their interactions, and EMDE to build compact representations of each entity's behavioral density
Additional modalities — Encodes text, images, and other non-behavioral fields using open-source models
Feature assembly — Combines all representations into a unified feature set ready for training
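
The data inspection step can be pictured with a toy heuristic like the one below. This is only an illustration of the idea of choosing an encoding from column types and value distributions. The function name, the encoding labels and the thresholds are invented for the example and do not describe the rules BaseModel actually applies.

```python
def suggest_encoding(values, max_categories=20, text_word_threshold=3):
    """Suggest a coarse encoding for one column from its observed values."""
    present = [v for v in values if v is not None]
    if not present:
        return "skip"
    # bool is a subclass of int in Python, so check it before numbers.
    if all(isinstance(v, bool) for v in present):
        return "categorical"
    if all(isinstance(v, (int, float)) for v in present):
        return "numeric"
    if all(isinstance(v, str) for v in present):
        average_words = sum(len(v.split()) for v in present) / len(present)
        if average_words > text_word_threshold:
            return "text"
        if len(set(present)) <= max_categories:
            return "categorical"
        return "high_cardinality"
    return "skip"


# A hypothetical table given as column name -> list of values.
columns = {
    "amount": [20.0, 5.5, None],
    "region": ["EU", "US", "EU"],
    "review": ["great shoes, fast delivery and nice packaging"],
}
encodings = {name: suggest_encoding(values) for name, values in columns.items()}
```

A real inspection stage would look at far more than Python types, but the shape is the same: one pass over each column, ending in a per-field encoding decision.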

Foundation Model Training

The foundation model learns from all your event data using self-supervised learning — no labels needed. It discovers temporal rhythms, sequential patterns, cross-event correlations, and behavioral segments purely from the structure of the data.

Think of it as teaching the model to understand how entities behave — what patterns are typical, what signals precede change, how different actions relate — without telling it what specific question to answer. This general understanding then transfers to any downstream prediction task.

Temporal Data Splits

To create training examples, BaseModel uses timestamps as split points that divide each entity's event stream into two segments:

History — Events before the split point, used to compute input features
Future — Events after the split point, used to derive the prediction target

During foundation model training, split points are generated between every pair of consecutive events for each entity, and a random subset is sampled each epoch. This means the model sees the same entity from many different points in time, building robust representations that generalize across prediction windows.

During inference, the model sees only the history — the future is what it predicts.
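
The split logic described above can be sketched as follows. This is a simplified illustration, not BaseModel's implementation: the helper names are hypothetical, split points are placed at the midpoint between consecutive events, and an event that falls exactly on a split point is assumed to belong to the future.

```python
import random
from datetime import datetime


def candidate_split_points(timestamps):
    """Return one split point between each pair of consecutive distinct timestamps."""
    ordered = sorted(timestamps)
    return [a + (b - a) / 2 for a, b in zip(ordered, ordered[1:]) if b > a]


def split_events(events, split_point):
    """Divide an entity's events into history (before) and future (from the split on)."""
    history = [e for e in events if e["timestamp"] < split_point]
    future = [e for e in events if e["timestamp"] >= split_point]
    return history, future


def sample_split_points(events, k, rng):
    """Draw up to k split points for one entity, as one epoch might."""
    points = candidate_split_points([e["timestamp"] for e in events])
    return rng.sample(points, min(k, len(points)))


events = [
    {"timestamp": datetime(2024, 1, 1), "action": "login"},
    {"timestamp": datetime(2024, 1, 3), "action": "purchase"},
    {"timestamp": datetime(2024, 1, 10), "action": "support_call"},
]
points = candidate_split_points([e["timestamp"] for e in events])
history, future = split_events(events, points[0])
epoch_points = sample_split_points(events, k=1, rng=random.Random(0))
```

With three events there are two candidate split points. Splitting at the first one leaves a single login in the history and the remaining two events in the future.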

Scenario Model

Fine-Tuning

A scenario model is a task-specific model built on top of the foundation. Rather than learning behavioral patterns from scratch, it inherits the foundation's understanding and adapts it to a concrete business question — churn prediction, spend forecasting, product recommendation, etc.

Because the heavy lifting is already done by the foundation, scenario models train quickly and need far less data to perform well. You can train many scenario models from a single foundation.

Target Function

The target function is a Python function you write that tells BaseModel what to predict. It receives the time-split history and future for each entity and returns a label: for example, whether a customer churned, how much they will spend, or which products they are likely to buy. This is the most important customization point, because the target function is how you encode your business logic into the model. See Target Function for the full API.
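
As a rough illustration of the idea, the sketch below labels churn from a history/future pair. It does not use BaseModel's real signature or data types, which are documented in the Target Function reference. The argument names, the event dictionaries, the 30-day window and the convention of returning `None` to skip an entity are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def churn_target(history, future, split_point, window=timedelta(days=30)):
    """Return 1 if no purchase occurs within `window` after the split, else 0.

    Returns None (assumed here to mean "skip") when the entity has no
    purchases in its history, since it cannot churn from a state it never had.
    """
    if not any(e["source"] == "purchases" for e in history):
        return None
    window_end = split_point + window
    purchased_again = any(
        e["source"] == "purchases" and e["timestamp"] < window_end
        for e in future
    )
    return 0 if purchased_again else 1


split_point = datetime(2024, 2, 1)
history = [{"source": "purchases", "timestamp": datetime(2024, 1, 5)}]
returning = [{"source": "purchases", "timestamp": datetime(2024, 2, 10)}]
lapsed = [{"source": "purchases", "timestamp": datetime(2024, 4, 1)}]

label_returning = churn_target(history, returning, split_point)
label_lapsed = churn_target(history, lapsed, split_point)
label_skipped = churn_target([], returning, split_point)
```

The business logic lives entirely in the function body: changing the window or the definition of "active" changes the question the scenario model learns to answer.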

Task Types

BaseModel supports five task types:

Task Type	What You Predict
Binary Classification	Yes / no outcome
Multiclass Classification	One category from a set
Multilabel Classification	Multiple categories at once
Regression	Numeric value(s)
Recommendation	Ranked list of items
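
To show how the five task types differ in what the label looks like, the sketch below derives one target of each kind from the same future window. The event fields, the category catalog and the dictionary keys are hypothetical, and the label formats BaseModel expects may differ.

```python
def targets_from_future(future, catalog):
    """Build one illustrative label per task type from future purchase events."""
    purchases = [e for e in future if e["source"] == "purchases"]
    spend = {}
    for event in purchases:
        spend[event["category"]] = spend.get(event["category"], 0.0) + event["amount"]
    return {
        # Binary classification: did the entity buy anything at all?
        "binary": int(bool(purchases)),
        # Multiclass classification: the single category with the highest spend.
        "multiclass": max(spend, key=spend.get) if spend else None,
        # Multilabel classification: one flag per catalog category.
        "multilabel": [int(category in spend) for category in catalog],
        # Regression: total spend in the window.
        "regression": sum(spend.values()),
        # Recommendation: categories ranked by spend, highest first.
        "recommendation": sorted(spend, key=spend.get, reverse=True),
    }


future = [
    {"source": "purchases", "category": "toys", "amount": 10.0},
    {"source": "purchases", "category": "books", "amount": 30.0},
    {"source": "purchases", "category": "toys", "amount": 5.0},
    {"source": "page_views", "category": "garden", "amount": 0.0},
]
targets = targets_from_future(future, catalog=["books", "garden", "toys"])
```

Here the same three purchases yield a yes/no flag, a top category, a set of category flags, a spend total and a ranked list, depending on which task type the scenario calls for.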