Validation & Best Practices
This page covers how to validate your target function before training, common mistakes to avoid, and performance tips.
Validating with verify_target()
Before launching a full training run, use verify_target() to check that your target function executes without errors and returns the values you expect. It runs the function against a sample of entities from your foundation model data and reports problems early:
- Executes target_fn on a sampled subset of entities
- Checks that the return type and shape match the task
- Tracks the None rate and fails if it exceeds the allowed threshold
- Surfaces exceptions with a stack trace and sample entity context
Basic Usage
from monad.ui.module import BinaryClassificationTask
from monad.ui.target_function import verify_target

verify_target(
    target_fn=my_target_fn,
    fm_checkpoint_path="./foundation_model",
    task=BinaryClassificationTask,
)
Pass extra columns to verify_target()
If your target function uses extra columns, pass data_params_overrides with the same extra_columns — otherwise validation will fail with a KeyError. See verify_target() for the full parameter reference.
If validation passes, verify_target() returns one example non-None target value (for quick inspection). If something is wrong, it raises one of:
| Error | Meaning |
|---|---|
| TypeError | Return type is incorrect or inconsistent across entities |
| ValueError | Too many entities returned None (exceeds percentage_nones_allowed) |
| RuntimeError | The target function itself failed during execution |
For the full parameter reference, see verify_target().
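To make the three error types in the table above concrete, here is a minimal, library-free sketch of the kind of checks described. It is not the implementation of verify_target(): sketch_verify, its parameters, and the use of plain Python lists in place of arrays are assumptions made purely for illustration.

```python
from typing import Any, Callable, Iterable, Optional


def sketch_verify(
    target_fn: Callable[[Any], Optional[list]],
    entities: Iterable[Any],
    percentage_nones_allowed: float = 50.0,
) -> Optional[list]:
    """Hypothetical stand-in for the checks verify_target() is described as doing."""
    total = 0
    nones = 0
    example = None
    for entity in entities:
        total += 1
        try:
            value = target_fn(entity)
        except Exception as exc:
            # The target function itself failed: report it with entity context.
            raise RuntimeError(f"target_fn failed on entity {entity!r}") from exc
        if value is None:
            nones += 1
            continue
        if not isinstance(value, list):
            raise TypeError(f"expected list or None, got {type(value).__name__}")
        if example is None:
            example = value  # keep the first non-None value for inspection
    none_rate = 100.0 * nones / total if total else 0.0
    if none_rate > percentage_nones_allowed:
        raise ValueError(f"{none_rate:.1f}% of entities returned None")
    return example


def toy_target(n: int) -> Optional[list]:
    # Toy eligibility rule: the first two entities are skipped.
    return None if n < 2 else [float(n)]


print(sketch_verify(toy_target, range(10), percentage_nones_allowed=30))  # [2.0]
```

Lowering percentage_nones_allowed to 10 in this toy run raises ValueError, because 2 of the 10 entities (20%) return None.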
Full Example
from typing import Dict
from datetime import timedelta

import numpy as np

from monad.ui.module import RegressionTask
from monad.ui.target_function import (
    Attributes, Events, verify_target,
    has_incomplete_training_window, SPLIT_TIMESTAMP,
)


def ltv_target(
    history: Events, future: Events, entity: Attributes, ctx: Dict
) -> np.ndarray:
    # Cold-start filter: require at least two historical transactions
    if history["transactions"].count() < 2:
        return None
    # Skip entities whose 30-day target window is incomplete
    if has_incomplete_training_window(ctx, timedelta(days=30)):
        return None
    # Label: sum of transaction prices in the 30 days after the split
    future_30d = future.interval_from(ctx[SPLIT_TIMESTAMP], timedelta(days=30))
    return np.array([future_30d["transactions"].sum("price")], dtype=np.float32)


result = verify_target(
    target_fn=ltv_target,
    fm_checkpoint_path="./foundation_model",
    task=RegressionTask,
    num_percentage_entities=5,    # check 5% of entities
    percentage_nones_allowed=70,  # allow up to 70% None
    log_every_n_steps=100,        # progress logging
)
print("Example target value:", result)
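The has_incomplete_training_window check in the example above guards against target windows that run past the end of the available data. The sketch below illustrates that idea under one assumption: that the check amounts to comparing the split timestamp plus the window length against the last timestamp in the data. The names window_is_incomplete and data_end are hypothetical, and the library's actual logic may differ.

```python
from datetime import datetime, timedelta


def window_is_incomplete(split_ts: datetime, data_end: datetime, window: timedelta) -> bool:
    """Return True when the window starting at split_ts extends past data_end."""
    return split_ts + window > data_end


data_end = datetime(2024, 6, 30)  # hypothetical last timestamp in the dataset

# A split on May 1 leaves a full 30-day window before the data ends.
print(window_is_incomplete(datetime(2024, 5, 1), data_end, timedelta(days=30)))   # False

# A split on June 15 leaves only 15 days of data, so the window is truncated.
print(window_is_incomplete(datetime(2024, 6, 15), data_end, timedelta(days=30)))  # True
```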
Debugging Failures
RuntimeError
Wrap risky sections in try/except, print counts, and guard empty data sources with if txns.count() == 0.
TypeError
Print type(result), result.shape, and result.dtype, and verify that they match the task.
ValueError
Your function returns too many None values. Log your eligibility conditions, temporarily relax filters, and use log_every_n_steps to see which entities are skipped.
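One way to apply these suggestions is to wrap the target function in a small decorator that tallies outcomes and reports exceptions before re-raising them. The sketch below uses only the standard library. The with_debug_counts helper and its stats attribute are hypothetical, not part of the library.

```python
import functools
from collections import Counter
from typing import Any, Callable


def with_debug_counts(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap fn so that None returns, values, and exceptions are tallied."""
    stats: Counter = Counter()

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            stats["raised"] += 1
            print(f"{fn.__name__} raised {type(exc).__name__}: {exc}")
            raise  # keep the original exception and traceback
        stats["none" if result is None else "value"] += 1
        return result

    wrapper.stats = stats  # expose the tallies for inspection after a run
    return wrapper


@with_debug_counts
def toy_target(n: int):
    # Toy rule: odd inputs are treated as ineligible.
    return None if n % 2 else [n]


for i in range(5):
    toy_target(i)
print(dict(toy_target.stats))
```

Because the wrapper forwards all positional and keyword arguments unchanged, the same decorator could be applied to a function with the (history, future, entity, ctx) signature.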
Validation Tips
- Start small. Leave num_percentage_entities at 1 (the default) for initial debugging. Increase it once the function runs cleanly to get a better picture of your None rate.
- Watch your None rate. A high None rate isn't necessarily wrong: cold-start filtering and incomplete windows are legitimate reasons to exclude entities. But if the rate is unexpectedly high, it often signals a bug in your filtering logic. Adjust percentage_nones_allowed to match your expectation.
- Pass data_params_overrides if you use extra columns. Validation loads data from the foundation model checkpoint. If your target function accesses extra columns, you need to supply the same DataParams with extra_columns defined, otherwise the function will fail with a KeyError.
- Enable progress logging. Set log_every_n_steps (e.g., 100) to print a status line every N entities, which is useful for spotting where failures cluster or confirming that the run is progressing.
- Use limit for large datasets. If your foundation model covers millions of entities, set limit to cap the number of evaluations and keep iteration fast.
Common Mistakes
- Forgetting to check for empty data: Calling aggregations on an empty DataSourceEvents will crash. Guard with if txns.count() == 0: return None before computing a label.
- Missing .events or .attribute: txns["price"] returns a ModalityEvents object, not an array. Use txns["price"].events to get the np.ndarray. Similarly, attributes["customers"]["age"] returns a ModalityAttribute; use .attribute to get the value.
- Leaking future information into eligibility checks: Use history (or attributes) to decide whether an entity qualifies. Use future only to compute the label. If you filter entities based on future data, the model receives a signal it cannot reproduce at inference time.

  Wrong: filtering on future

      future_30d = future.interval_from(ctx[SPLIT_TIMESTAMP], timedelta(days=30))
      if future_30d["transactions"].count() == 0:
          return None  # removes all churners from training!

  Correct: filtering on history, labeling from future

      # eligibility: entity must have history
      if history["transactions"].count() == 0:
          return None
      # ensure a complete target window
      if has_incomplete_training_window(ctx, timedelta(days=30)):
          return None
      # label: use future only to compute the target
      future_30d = future.interval_from(ctx[SPLIT_TIMESTAMP], timedelta(days=30))
      return np.array([1 if future_30d["transactions"].count() == 0 else 0], dtype=np.float32)

- Skipping has_incomplete_training_window: Without this check, split points near the end of your data produce windows shorter than intended, creating inconsistent labels.
- Forgetting to unpack .groupBy: All groupBy operations return (values, names). Always unpack: counts, names = txns.groupBy("category").count().
- Wrong timedelta direction: A positive timedelta moves forward (into the future); a negative one moves backward (into history). A common error is using a positive timedelta on history, which selects nothing.
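The effect of filtering on future data can be seen with a toy dataset. The sketch below uses plain (history_count, future_count) tuples in place of real Events objects, an assumption made so that it runs without the library. The names leaky_label and correct_label are hypothetical.

```python
from typing import List, Optional, Tuple

# Each toy entity is (transactions in history, transactions in the 30-day future window).
entities: List[Tuple[int, int]] = [(5, 0), (3, 2), (0, 1), (4, 0), (2, 7)]


def leaky_label(history_count: int, future_count: int) -> Optional[int]:
    # Wrong: eligibility depends on the future, so every churner is dropped.
    if future_count == 0:
        return None
    return 1 if future_count == 0 else 0


def correct_label(history_count: int, future_count: int) -> Optional[int]:
    # Correct: eligibility depends on history only; future is used for the label.
    if history_count == 0:
        return None
    return 1 if future_count == 0 else 0


leaky = [leaky_label(h, f) for h, f in entities]
correct = [correct_label(h, f) for h, f in entities]
print(leaky)    # [None, 0, 0, None, 0]
print(correct)  # [1, 0, None, 1, 0]
```

With the leaky filter, the positive class never appears among the training labels. The history-based filter keeps both classes and excludes only the entity with no history.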
Performance Tips
- Use built-in aggregations: txns.sum("price") is faster than manual loops over .events.
- Exit early: Return None as soon as you know an entity is invalid.
- Cache data source access: Assign txns = future["transactions"] once and reuse.
- Avoid manual loops: Use the built-in aggregation and filtering methods rather than iterating over .events arrays yourself.
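The early-exit and caching tips can be combined in one pattern. The sketch below uses a hypothetical CountingSource class in place of a real Events object so that the number of lookups is visible. It says nothing about the actual cost of a lookup in the library.

```python
from typing import Dict, List, Optional


class CountingSource:
    """Hypothetical stand-in for an events container that counts lookups."""

    def __init__(self, data: Dict[str, List[float]]) -> None:
        self._data = data
        self.lookups = 0

    def __getitem__(self, key: str) -> List[float]:
        self.lookups += 1
        return self._data[key]


def uncached_target(future: CountingSource) -> Optional[List[float]]:
    # Looks up "transactions" twice: once for the guard, once for the sum.
    if len(future["transactions"]) == 0:
        return None
    return [sum(future["transactions"])]


def cached_target(future: CountingSource) -> Optional[List[float]]:
    txns = future["transactions"]  # single lookup, reused below
    if not txns:                   # exit early on empty data
        return None
    return [sum(txns)]


a = CountingSource({"transactions": [10.0, 5.0]})
b = CountingSource({"transactions": [10.0, 5.0]})
print(uncached_target(a), a.lookups)  # [15.0] 2
print(cached_target(b), b.lookups)    # [15.0] 1
```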
Putting It Together
Once your target function is validated, wire it into a training run:
# --- Imports ---
from pathlib import Path
from datetime import timedelta

import numpy as np

from monad.ui.module import load_from_foundation_model, BinaryClassificationTask
from monad.ui.config import TrainingParams
from monad.ui.target_function import Events, Attributes, has_incomplete_training_window, SPLIT_TIMESTAMP

# --- Task & target function ---
task = BinaryClassificationTask()


def churn_target(history: Events, future: Events, entity: Attributes, ctx: dict):
    ...
    return np.array([1 if future["transactions"].count() == 0 else 0],
                    dtype=np.float32)


# --- Load from foundation model ---
module = load_from_foundation_model(
    checkpoint_path="./foundation_model",
    downstream_task=task,
    target_fn=churn_target,
)

# --- Execute training ---
training_params = TrainingParams(
    checkpoint_dir=Path("./churn_model"),
    # ...
)
module.fit(training_params=training_params)
Once task and target are defined, move on to Model Configuration to set up and run training.