Testing Parameters

TestingParams configures model evaluation and prediction generation. Pass it to the .test() and .predict() methods of a scenario model.

test() vs predict()

Both methods accept TestingParams, but they differ in behavior:

Aspect	`test()`	`predict()`
Purpose	Evaluate against ground truth	Score new/unseen data
Metrics	Computed and logged	Ignored
Data split	Uses the test split from training	Uses `prediction_date` as cutoff
`prediction_date`	Optional (defaults to test split boundary)	Required for entity-based splits

Use test() during development to measure model quality. Use predict() to generate production predictions.

from monad.config import TestingParams, OutputType

`TestingParams`

Parameter	Type	Default	Description
`output_type`	`OutputType`	required	Format in which to save the predictions. See OutputType below.
`local_save_location`	`Path | None`	`None`	Local file path for predictions in TSV format. Must end with `.tsv`.
`remote_save_location`	`DataLocation | None`	`None`	Remote database table for storing predictions. Snowflake and Databricks are supported.
`limit_test_batches`	`int | None`	`None`	Limit number of test/predict batches to process.
`precision`	`Literal[...]`	`32`	Float precision used for testing. See Precision Values.
`prediction_date`	`datetime | None`	`None`	Date for which to make predictions. Required when using entity-based splits.

Inherited Parameters

Shared with TrainingParams:

Parameter	Type	Default	Description
`devices`	`list[int] | int`	`1`	GPU devices to use.
`accelerator`	`"cpu" | "gpu"`	`"gpu"`	Accelerator type.
`strategy`	`str | None`	`None`	Distributed strategy.
`metrics`	`list[MetricParams | CustomMetric]`	`[]`	Metrics to compute during testing. See MetricParams and CustomMetric.
`top_k`	`int | None`	`None`	Limit predictions to top-k items/classes (recommendation, multilabel).
`predictions_threshold`	`float | None`	`None`	Classification threshold (binary, multilabel). Mutually exclusive with `top_k`.
`entity_ids`	`EntityIds | None`	`None`	Limit predictions to specific entity IDs.
`callbacks`	`list[Callback]`	`[]`	PyTorch Lightning callbacks.
`approximate_decoding_params`	`ApproximateDecodingParams | None`	`None`	Approximate decoding for recommendation tasks.
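
The difference between `top_k` and `predictions_threshold` can be pictured with a small standalone sketch. It does not use the library: `select_classes` is a hypothetical helper, the class names and probabilities are made up, and the `>=` comparison against the threshold is an assumption about how filtering might behave.

```python
from typing import Optional


def select_classes(
    probs: dict[str, float],
    top_k: Optional[int] = None,
    predictions_threshold: Optional[float] = None,
) -> list[str]:
    """Hypothetical helper: filter class probabilities by rank or by threshold."""
    # Mirrors the documented rule that the two options are mutually exclusive.
    if top_k is not None and predictions_threshold is not None:
        raise ValueError("top_k and predictions_threshold are mutually exclusive")

    # Rank class names from highest to lowest probability.
    ranked = sorted(probs, key=probs.get, reverse=True)

    if top_k is not None:
        return ranked[:top_k]  # keep the k best classes regardless of score
    if predictions_threshold is not None:
        # Assumption: a class is kept when its probability is >= the threshold.
        return [name for name in ranked if probs[name] >= predictions_threshold]
    return ranked  # no filtering requested


# Made-up multilabel probabilities for one entity.
probs = {"news": 0.62, "sports": 0.91, "travel": 0.08}

by_rank = select_classes(probs, top_k=2)
by_threshold = select_classes(probs, predictions_threshold=0.7)
```

With `top_k` the number of returned classes is fixed, while with a threshold it depends on the scores, which is one reason to pick only one of the two.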

`OutputType`

from monad.config import OutputType

# Available values:
OutputType.RAW_MODEL
OutputType.ENCODED
OutputType.DECODED
OutputType.SEMANTIC

Meaning Per Task Type

Task	`RAW_MODEL`	`ENCODED`	`DECODED`	`SEMANTIC`
Binary	Logits	Logits	Probabilities	0 or 1 (based on threshold)
Multiclass	Log-softmax	Log-softmax	Probabilities (with filtering)	Class names (with filtering)
Multilabel	Logits	Logits	Probabilities (with filtering)	Class names (with filtering, requires `top_k`)
Regression	Raw output	Internal representation	Human-readable values	Human-readable values
Recommendation	Raw output	Sketch (compact)	Probabilities per item	Item IDs/names

For every task type except recommendation, DECODED is the suggested output type.

Tip

Use DECODED for most inference pipelines. Use SEMANTIC when results need to be human-readable. Use ENCODED for recommendation models when you need compact output that can be decoded later with readout_sketch().
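
To make the table above more concrete, here is a rough standalone sketch of how logits, probabilities, and thresholded labels relate for a binary task, and how log-softmax scores relate to probabilities for a multiclass task. It uses only the standard library. The sigmoid mapping, the 0.5 default threshold, and the `>=` comparison are assumptions based on common practice, not confirmed library behavior, and the input values are made up.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def to_label(probability: float, threshold: float = 0.5) -> int:
    """Turn a probability into 0 or 1. Default threshold and >= are assumptions."""
    return 1 if probability >= threshold else 0


# Binary task: made-up logits, as RAW_MODEL / ENCODED output might look.
logits = [-2.0, 0.0, 3.0]
probabilities = [sigmoid(x) for x in logits]   # comparable to DECODED
labels = [to_label(p) for p in probabilities]  # comparable to SEMANTIC

# Multiclass task: exponentiating log-softmax scores recovers probabilities.
log_softmax = [math.log(0.7), math.log(0.2), math.log(0.1)]
class_probabilities = [math.exp(value) for value in log_softmax]
```

The practical consequence is that DECODED output can be thresholded or ranked later, whereas SEMANTIC output has already had the threshold applied.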

Usage Examples

Basic Test and Predict

Both methods accept an optional seed parameter for reproducible ordering of results.

from pathlib import Path
from monad.config import TestingParams, OutputType, MetricParams

testing_params = TestingParams(
    output_type=OutputType.DECODED,
    devices=[0],
    local_save_location=Path("./predictions.tsv"),
    metrics=[
        MetricParams(alias="auroc", metric_name="AUROC"),
    ],
)

# `module` is assumed to be an already trained scenario model.
# Test: loads the checkpoint and returns predictions
results = module.test(testing_params)

# Predict: loads the checkpoint and saves predictions to the local/remote location
module.predict(testing_params, seed=42)

Recommendation with Top-K

testing_params = TestingParams(
    output_type=OutputType.SEMANTIC,
    devices=[0],
    top_k=10,
    local_save_location=Path("./top10_predictions.tsv"),
)

module.predict(testing_params)

Multi-GPU Inference

testing_params = TestingParams(
    output_type=OutputType.DECODED,
    devices=[0, 1, 2, 3],
    strategy="ddp",
    local_save_location=Path("./predictions.tsv"),
)

module.predict(testing_params)

Writing Predictions to Snowflake

from monad.config import TestingParams, OutputType
from monad.config.data_source import DataLocation

testing_params = TestingParams(
    output_type=OutputType.DECODED,
    devices=[0],
    remote_save_location=DataLocation(
        database_type="snowflake",
        connection_params={
            "user": "${SNOWFLAKE_USER}",
            "password": "${SNOWFLAKE_PASSWORD}",
            "account": "${SNOWFLAKE_ACCOUNT}",
            "warehouse": "${SNOWFLAKE_WAREHOUSE}",
            "database": "MY_DATABASE",
            "schema": "PUBLIC",
        },
        table_name="predictions_output",
    ),
)

module.predict(testing_params)

Writing Predictions to Databricks

from monad.config import TestingParams, OutputType
from monad.config.data_source import DataLocation

testing_params = TestingParams(
    output_type=OutputType.DECODED,
    devices=[0],
    remote_save_location=DataLocation(
        database_type="databricks",
        connection_params={
            "host": "${DATABRICKS_HOST}",
            "warehouse_id": "${DATABRICKS_WAREHOUSE_ID}",
            "token": "${DATABRICKS_TOKEN}",
        },
        table_name="predictions_output",
    ),
)

module.predict(testing_params)

Databricks write behavior

The target table is created on demand if it does not exist, and rows are appended in batches (no surrounding transaction). Tune the batch size with the DATABRICKS_WRITE_BATCH_SIZE environment variable (default 1000).
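
As an illustration of batched appends, the sketch below reads DATABRICKS_WRITE_BATCH_SIZE with the documented default of 1000 and splits rows into batches of that size. It is not the library's writer: `batch_size_from_env` and `batched` are hypothetical helpers, the rows are made up, and the variable is set to 4 here only to keep the example small.

```python
import os
from itertools import islice
from typing import Iterable, Iterator


def batch_size_from_env(default: int = 1000) -> int:
    """Read the batch size from the environment, falling back to the default."""
    return int(os.environ.get("DATABRICKS_WRITE_BATCH_SIZE", default))


def batched(rows: Iterable[tuple], size: int) -> Iterator[list[tuple]]:
    """Yield consecutive lists of at most `size` rows."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


# Set the variable before running predictions; a tiny value is used for illustration.
os.environ["DATABRICKS_WRITE_BATCH_SIZE"] = "4"

# Made-up prediction rows: (entity_id, score).
rows = [(f"entity_{i}", 0.1 * i) for i in range(10)]

# Each batch would correspond to one append; here they are only collected.
batches = list(batched(rows, batch_size_from_env()))
```

Because the appends are not wrapped in a transaction, a run that fails partway may leave the earlier batches in the table, so it can be worth checking row counts after an interrupted job.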

TSV Output Schema

When local_save_location is set, predictions are saved as a tab-separated file. The columns depend on the task type and output_type:

Task	Columns
Binary	`entity_id`, `score`, `label`
Multiclass	`entity_id`, `score_<class1>`, `score_<class2>`, ..., `label`
Multilabel	`entity_id`, `score_<class1>`, `score_<class2>`, ..., `label_<class1>`, ...
Regression	`entity_id`, `prediction`, `label`
Recommendation	`entity_id`, `item_id`, `score` (with `SEMANTIC`/`DECODED`)

Tip

Inspect the header before parsing: head -1 predictions.tsv. Column names and ordering vary by task type and output type. The label column is only present in test() output (not predict()).
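
The same header check can be done from Python. The sketch below uses only the standard library, writes a small made-up file with the binary-task columns listed above, and then reads it back. `read_header` is a hypothetical helper and the sample row is illustrative, not real output.

```python
import csv
import tempfile
from pathlib import Path


def read_header(path: Path) -> list[str]:
    """Return the column names from the first line of a TSV file."""
    with path.open(newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"))


# Made-up sample file standing in for real test() output of a binary task.
sample = Path(tempfile.mkdtemp()) / "predictions.tsv"
sample.write_text("entity_id\tscore\tlabel\nuser_1\t0.83\t1\n")

columns = read_header(sample)

# The label column is only expected in test() output, so check before using it.
has_label = "label" in columns

# Parse rows by column name rather than by position.
with sample.open(newline="") as handle:
    rows = list(csv.DictReader(handle, delimiter="\t"))
```

Reading rows by column name keeps the parsing code working when the column order differs between task types or output types.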

Prediction Utilities

`readout_sketch()`

Decode recommendation predictions saved with OutputType.ENCODED into per-item scores.

from monad.ui.module import readout_sketch

generator = readout_sketch(
    predictions_file="./predictions.tsv",
    checkpoint_path="./reco_model",
)

for entity_id, scores in generator:
    print(f"Entity: {entity_id}, Scores shape: {scores.shape}")

Parameter	Type	Description
`predictions_file`	`str`	Path to predictions file saved with `OutputType.ENCODED`.
`checkpoint_path`	`str`	Path to the recommendation model checkpoint.

Returns a generator yielding (entity_id: str, scores: np.ndarray) tuples.

`read_target_entity_ids()`

Get the mapping from target entity IDs (e.g., product IDs) to their indices in the decoded sketch.

from monad.ui.module import read_target_entity_ids

target_to_index = read_target_entity_ids(
    checkpoint_path="./reco_model",
)
# Returns: {"product_001": 0, "product_002": 1, ...}

Parameter	Type	Description
`checkpoint_path`	`str`	Path to the recommendation model checkpoint.

Returns a dict[str, int] mapping entity IDs to indices.
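
The two utilities are naturally combined: the index mapping says which position in a scores array belongs to which target entity. The sketch below shows one way to turn a scores array into the best-scoring item IDs. It runs on made-up data instead of calling the library; `top_items` is a hypothetical helper, and a plain list stands in for the `np.ndarray` that `readout_sketch()` yields.

```python
import heapq


def top_items(scores, target_to_index: dict[str, int], k: int = 3) -> list[tuple[str, float]]:
    """Return the k highest-scoring (target_id, score) pairs, best first."""
    # Invert the mapping so positions in `scores` can be turned back into IDs.
    index_to_target = {index: target for target, index in target_to_index.items()}
    # Pick the positions of the k largest scores; works for lists and arrays.
    best = heapq.nlargest(k, range(len(scores)), key=lambda i: scores[i])
    return [(index_to_target[i], float(scores[i])) for i in best]


# Made-up stand-ins for read_target_entity_ids() and one readout_sketch() item.
target_to_index = {"product_001": 0, "product_002": 1, "product_003": 2, "product_004": 3}
scores = [0.10, 0.70, 0.05, 0.15]

recommended = top_items(scores, target_to_index, k=2)
```

In a real pipeline the mapping would be loaded once and `top_items` applied to each `(entity_id, scores)` pair from the generator.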