Release 0.9.0

Features

  • Grouped Decimal Features in Interpretability
    Introduced the ability to handle and analyze grouped decimal features, enhancing model interpretability by offering more granular insights into feature behavior.

  • Event Attributions for Interpreting Recommendation Models
    With event attributions, users can now trace back and understand how specific events influence model outputs and predictions.

  • Prediction Storage in Snowflake Database
    Added functionality to save predictions directly into a Snowflake database, simplifying data integration and storage workflows.

  • Data Source Name in Minimum Group Size Logs
    Added logging of the data source name when enforcing minimum group size requirements, making it easier to identify and troubleshoot group size issues.

  • Join Functionality for Attribute Data Sources (enhanced)
    Expanded support to allow joining attribute data sources with multiple data sources, enabling richer and more complex data merging capabilities.

  • Filtering on Extra Columns in Data Source Definition
    Users can now filter, group, and leverage extra columns passed in the data source definition for audience building purposes. Previously, these columns were restricted to BaseModel for Foundation Model training.

  • New Parameter in DataParams: training_end_date
    Introduced the training_end_date parameter, with a default value set to validation_start_date - 1, providing more flexibility and control over model training timelines.

  • New Parameters in TestingParams: local_save_location, remote_save_location
    Introduced local_save_location and remote_save_location as parameters within TestingParams, replacing the previous save_path to offer greater clarity and customization in specifying where to save test results.

    Note: Update your configuration file to replace save_path with local_save_location and remote_save_location.

  • Extended Group Max Retries
    The default number of group computation retries and the default retry interval have been increased: GROUPS_N_RETRIES now defaults to 20 and GROUPS_RETRY_INTERVAL now defaults to 60. Allowing more time for retries reduces the likelihood of failures due to transient issues and improves overall robustness. For more information, see num_groups in the Dividing event tables section.

  • Entity Number Limit for Target Function Validation
    The number of entities that can be used when validating target functions is now capped to keep validation efficient and prevent overload during the validation process.

  • Enhanced Debug Messages for Target Function Validation
    More comprehensive debug messages have been added during target function validation to assist in troubleshooting and increase transparency in the validation process.
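
The TestingParams change above requires a configuration update. The following is a minimal sketch of one way to migrate existing settings, not the library's own migration path. It assumes the parameters are held in a plain Python dict and that the old save_path pointed at a local directory. The helper name migrate_testing_params is hypothetical.

```python
from typing import Any, Dict, Optional


def migrate_testing_params(
    params: Dict[str, Any], remote_save_location: Optional[str] = None
) -> Dict[str, Any]:
    """Return a copy of `params` with the legacy `save_path` key replaced.

    Assumption: the old `save_path` referred to a local directory, so its
    value is carried over to `local_save_location`. Adjust this if your
    previous path was a remote location.
    """
    migrated = dict(params)  # shallow copy; the input dict is left untouched
    legacy_path = migrated.pop("save_path", None)
    if legacy_path is not None:
        # Keep an explicitly set local_save_location if one already exists.
        migrated.setdefault("local_save_location", legacy_path)
    if remote_save_location is not None:
        migrated["remote_save_location"] = remote_save_location
    return migrated


# Hypothetical legacy settings, used only for illustration.
old_params = {"save_path": "/tmp/test_results"}
new_params = migrate_testing_params(old_params)
print(new_params)
```

If the previous save_path was a remote location, pass it as remote_save_location instead and drop the local default.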

Fixes

  • Fixed None value causing issues in grouping.
  • Fixed regression loss calculation and logging.
  • Fixed an error that occurred when a pandas query could not parse groups from joined data sources.
  • Made the Neptune alerter log system metrics to a common namespace.
  • Removed unused validation for the main entity attribute data source and an unused loss function.
  • Converted sparse batch to dense in interpretability.
  • Fixed handling of metrics not found in Neptune.
  • Reduced memory consumption by replacing list construction with a generator.
  • Created directory based on cache path.
  • Added schema to columns selection in Hive builder.
  • Handled potential NaNs in decimal calculator.
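
One of the fixes above replaces list construction with a generator. The generic sketch below shows why that helps. It is not the library's code: squares_list and squares_gen are hypothetical stand-ins for any function that produces many intermediate values.

```python
import sys


def squares_list(n: int) -> list:
    # Builds every element up front, so all n values live in memory at once.
    return [i * i for i in range(n)]


def squares_gen(n: int):
    # Yields one element at a time; only the current value is held.
    return (i * i for i in range(n))


n = 100_000
total_from_list = sum(squares_list(n))
total_from_gen = sum(squares_gen(n))

# Size of the container objects themselves (not the integers they refer to).
list_size = sys.getsizeof(squares_list(n))
gen_size = sys.getsizeof(squares_gen(n))

print(total_from_list == total_from_gen)
print(gen_size < list_size)
```

Both forms give the same result, but the generator object stays small regardless of n. The trade-off is that a generator can be consumed only once.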

Docs

  • Updated the documentation navigation to be more readable and user-friendly.
  • Added Recipes section for easy reference when building target functions.