Changelog

New features

  • Add dimension checks in interpretability
  • Allow setting the maximum percentage of nulls in a column
  • Forbid undefined fields in configs
  • Handle None as entity_id in parquet files
  • Create new data source definition
  • New config.yaml design
  • Enable caching queried data
  • Handle duplicated column names when joining tables
  • Support parquet data source
  • Fix monad metrics
  • Validate allowed_columns
  • Support lambdas at config level
  • Implement mechanism for metric initialization
  • Add joins to benchmarking configs
  • Cast main_entity_id to string
  • Validate columns uniqueness
  • Allow defining lambdas in extra columns
  • Add recommendations to interpretability
  • Set max number of expressions via environment variable
  • Verify if data source name contains any forbidden sequences

Fixes

  • Add recency modality slices to feature value interpretability
  • Allow join_on column in select
  • Allow None value for limit_train_batches
  • Always use stored config at pretraining phase
  • Change defaults for loader params
  • Check data source type before accessing date column
  • Fix Recommendation model
  • Fix date parsing in Hive
  • Make Snowflake config work with new setup
  • Use alias and table name correctly
  • Fix metrics in training params
  • Append suffix to with clause alias
  • Fix detecting cyclic joins

0.6.0 (2024-04-23)

Features

  • Add BM colors to interpretability plot
  • Add interpret function for use in scripts
  • Add methods for weighting training examples
  • Adjust Hive to use INI files
  • Enable setting 'ignore_entities_without_events' flag
  • Extract queries from connectors
  • Create common mechanism for query execution
  • Refactor query builders
  • Add treemap visualization
  • Add treemap generation from predefined hierarchy
  • Replace sampling method with actual sampling
  • Make attribution average optional
  • Introduce Python 3.11 support
  • Add regression task to interpretability
  • Support training resuming
  • Create chunks based on partition column
  • Support booleans in fit stage

Fixes

  • Add quotation marks around table names in dialect providers
  • Add quotation marks to entity ids subquery
  • Add reset method to LongCastingMetric
  • Add return statement to FM get trainable module
  • Cast Hive decimal columns to float
  • Change cache dir type
  • Fix id info parsing
  • Handle empty iterator while caching
  • Fix hash sketches hashing function and tests
  • Add options to change interpretability sample size
  • Fix time shift when caching datetimes
  • Handle decimal types in Hive training iterator
  • Fix ignore_entities_without_events flag
  • Fix combining tiles with the same name and different id
  • Catch prediction on None object and fix runtime threshold
  • Remove dask-ml, bump ray, use compatible dask version
  • Set enable_checkpointing flag according to the callbacks setup
  • Small fix in one-hot-encoders
  • Stop logging warnings for uppercase unquoted columns in Snowflake

0.5.0 (2024-01-18)

Features

  • Add interpretability
  • Add logging column names
  • Add resume option for columns and fix a minor bug related to text column processing
  • Add target filtering to the inference module
  • Use PyODBC for connecting with Hive

Fixes

  • Fix chunking in Hive queries
  • Convert max num columns to int
  • Fix Cleora circular dependency imports