
Releases: mad-lab-fau/tpcp

v2.0.0 - Major scorer rework

24 Oct 10:12

[2.0.0] - 2024-10-24

Added

  • The global cache helper now supports algorithms with multiple action methods by specifying the name of the action
    method you want to cache.
    (#118)
  • Global disk cache helper should now be able to cache the action methods of algorithm classes defined in the main
    script.
    (#118)
  • The new built-in FloatAggregator and MacroFloatAggregator should cover many of the use cases that
    previously required custom aggregators.
    (#118)
  • Scorers now support passing a final_aggregator. It is called after all scoring and aggregation has happened and makes it
    possible to implement complicated "meta" aggregation that depends on the results of all scores of all datapoints.
    Note that we are not sure yet whether this should be treated as an escape hatch, with overuse considered
    an anti-pattern, or whether it is exactly the other way around.
    We need to experiment in a couple of real-life applications to figure this out.
    (#120)
  • Dataset classes now have a proper __eq__ implementation.
    (#120)

Changed

  • Relatively major overhaul of how aggregators in scoring functions work. Before, aggregators were classes that were
    initialized with the value of a score. Now they are instances of a class that is called with the value of a score.
    This change makes it possible to create "configurable" aggregators that get their configuration at initialization time.
    (#118)
    This comes with a couple of breaking changes:
    • The most "user-facing" one is that the NoAgg aggregator is now called no_agg indicating that it is an instance
      of a class and not a class itself.
    • All custom aggregators need to be rewritten, but you will likely find that they are much simpler now
      (see the reworked examples for custom aggregators).
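
The difference between "a class initialized with the score value" and "an instance called with the score values" can be pictured with a small standalone sketch. It does not use tpcp: MeanWithFallback and its empty_value parameter are hypothetical names, chosen only to show why a callable instance can carry configuration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class MeanWithFallback:
    """Hypothetical configurable aggregator (illustration only, not part of tpcp)."""

    # Configuration is fixed once, when the instance is created.
    empty_value: float = float("nan")

    def __call__(self, values: Sequence[float]) -> float:
        # The score values arrive later, at call time.
        if len(values) == 0:
            return self.empty_value
        return mean(values)


# Two differently configured instances of the same aggregator class.
mean_or_nan = MeanWithFallback()
mean_or_zero = MeanWithFallback(empty_value=0.0)

result_full = mean_or_zero([0.5, 1.0, 0.0])
result_empty = mean_or_zero([])
```

With the old "class holds the value" design, the constructor was already occupied by the score value, so there was no natural place for a parameter such as empty_value.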

Fixed

  • Fixed a massive performance regression introduced in version 0.34.1 that affected people who had tensorflow or torch installed but
    did not use them in their code.
    The reason was that we imported the two modules in the global scope, which made importing tpcp very
    slow.
    This was particularly noticeable with multiprocessing, as the modules were imported in every worker process.
    We now import each module only within the clone function, and only if you had imported it before.
    (#118)
  • The custom hash function now has a different way of hashing functions and classes defined in local scopes.
    This should prevent strange pickling errors from just using "tpcp" normally.
    (#118)
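
The "only if you had imported it before" idea from the import fix can be sketched with sys.modules. This is an assumption about the general technique, not tpcp's actual code, and is_instance_of_optional is a hypothetical helper name.

```python
import sys
from typing import Any


def is_instance_of_optional(obj: Any, module_name: str, class_name: str) -> bool:
    """Check obj against a class from an optional module without importing it.

    Hypothetical helper for illustration (not tpcp's implementation).
    """
    # If the module was never imported in this process, obj can realistically
    # not be an instance of one of its classes, so the expensive import is skipped.
    module = sys.modules.get(module_name)
    if module is None:
        return False
    cls = getattr(module, class_name, None)
    return isinstance(cls, type) and isinstance(obj, cls)


# Cheap stdlib stand-in for a heavy optional dependency such as torch.
from fractions import Fraction

found = is_instance_of_optional(Fraction(1, 2), "fractions", "Fraction")
not_loaded = is_instance_of_optional(1.0, "module_that_was_never_imported", "Tensor")
```

Because the lookup never triggers an import, worker processes that do not use the optional module pay no startup cost for it.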

Removed

  • Score functions implemented directly as methods on the pipeline class are no longer supported.
    Score functions now need to be independent functions that take a pipeline instance as their first argument.
    For this reason, passing None as the scoring argument of any validate or optimize
    method is also no longer supported.
    (#120)
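
A minimal sketch of such an independent score function follows. ThresholdPipeline is a hypothetical plain-Python stand-in (a real pipeline would subclass tpcp's pipeline base class), and taking the datapoint as the second argument is an assumption; the entry above only states that the pipeline comes first.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ThresholdPipeline:
    """Hypothetical stand-in for a pipeline (illustration only)."""

    threshold: float = 0.5

    def run(self, datapoint: List[float]) -> "ThresholdPipeline":
        # Store the result on the instance and return self.
        self.n_above_ = sum(value > self.threshold for value in datapoint)
        return self


def score(pipeline: ThresholdPipeline, datapoint: List[float]) -> Dict[str, float]:
    """Standalone score function: the pipeline instance is the first argument."""
    n_above = pipeline.run(datapoint).n_above_
    return {"fraction_above": n_above / len(datapoint)}


scores = score(ThresholdPipeline(threshold=0.5), [0.1, 0.6, 0.9, 0.4])
```

Since the function no longer lives on the class, it has to be passed explicitly wherever a scoring argument is expected, which is why None is no longer accepted there.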

v1.0.1 - Resolved install issues with UV

18 Oct 09:00

[1.0.1] - 2024-10-18

Fixes the names of the optional dependency groups. That should resolve install issues when using uv as the package manager.

v1.0.0 - Cross-Validation improved!

03 Jul 09:11

[1.0.0] - 2024-07-03

Note: This is a major version bump because we have quite substantial breaking changes. The 1.0 does not signal that we
are now feature complete, although the core APIs have been mostly stable for quite some time now.

BREAKING CHANGE

  • Instead of the (annoying) mock_label and group_label arguments, all functions that take a cv-splitter as input
    can now take an instance of the new DatasetSplitter class, which elegantly handles grouping and stratification and
    also removes the need to forward the mock_label and group_label arguments to the underlying optimizer.
    The mock_label and group_label arguments have been removed without deprecation.
    (#114)
  • All classes and methods that produce "grid-search"- or "cross-validate"-like output (GridSearch, GridSearchCV, cross_validate, validate)
    have updated names for all their output attributes.
    In most cases, the output naming has switched from a single underscore to a double underscore to separate the different
    parts of the output name, which makes it easier to access the output programmatically.
    (#117)
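
To illustrate why a double-underscore separator helps with programmatic access, here is a small sketch that nests a flat result dictionary. The keys are hypothetical examples of the naming pattern, not the actual tpcp output names, and nest_by_double_underscore is an illustrative helper.

```python
from typing import Any, Dict

# Hypothetical result keys following a "part__part__name" pattern.
results: Dict[str, Any] = {
    "test__agg__f1_score": 0.8,
    "test__agg__precision": 0.75,
    "train__agg__f1_score": 0.9,
}


def nest_by_double_underscore(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {"a__b__c": v} into {"a": {"b": {"c": v}}}."""
    nested: Dict[str, Any] = {}
    for key, value in flat.items():
        *parents, leaf = key.split("__")
        level = nested
        for part in parents:
            level = level.setdefault(part, {})
        level[leaf] = value
    return nested


nested_results = nest_by_double_underscore(results)
```

Splitting on "__" stays unambiguous even when an individual part, such as f1_score here, contains a single underscore itself.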

v0.34.1 - Fix Torch and Tensorflow support

02 Jul 15:15

Fixed

  • The torch hasher was not working at all. This is hopefully fixed now.
  • The tensorflow clone method did not work. Switched to specialized implementation that hopefully works.

v0.34.0 - Some smaller improvements

28 Jun 12:21

[0.34.0] - 2024-06-28

Added

  • Dataset classes are now generic and allow you to provide the group-label tuple as generic. This allows for better type
    checking and IDE support. (#113)
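
What "providing the group-label tuple as generic" buys in terms of type checking can be sketched with plain typing tools. BaseDataset below is a hypothetical stand-in, not tpcp's Dataset class, and RecordingLabel is an illustrative label type.

```python
from typing import Generic, NamedTuple, TypeVar

# The type variable stands for the concrete group-label tuple type.
GroupLabelT = TypeVar("GroupLabelT", bound=tuple)


class BaseDataset(Generic[GroupLabelT]):
    """Hypothetical stand-in for a generic dataset base class."""

    def __init__(self, group_label: GroupLabelT) -> None:
        self._group_label = group_label

    @property
    def group_label(self) -> GroupLabelT:
        return self._group_label


class RecordingLabel(NamedTuple):
    participant: str
    session: int


class RecordingDataset(BaseDataset[RecordingLabel]):
    """Binding the type variable lets type checkers know the label fields."""


dataset = RecordingDataset(RecordingLabel(participant="p01", session=2))
participant = dataset.group_label.participant
```

With the type variable bound to RecordingLabel, an IDE can autocomplete participant and session, and a type checker can flag a misspelled field name.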

Changed/Fixed

  • The snapshot utilities are much more robust now and raise appropriate errors when the stored dataframes have
    unsupported properties. (#112)

v0.33.1 - Less caching warnings

14 Jun 13:40
Less caching warnings and closes #111

v0.33.0 - Some more TypedIterator stuff and some QoL improvements

23 May 12:38

[0.33.0] - 2024-05-23

Added

  • custom_hash, the internally used hashing method based on pickle, is now part of the public API via tpcp.misc.
  • DummyOptimize now allows you to ignore the warning that it usually emits.
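
The general idea of a pickle-based hash can be sketched as follows. This is an assumption-laden illustration of the technique, not the implementation of custom_hash; pickle_hash is a hypothetical helper, and the choice of SHA-256 is arbitrary.

```python
import hashlib
import pickle
from typing import Any


def pickle_hash(obj: Any) -> str:
    """Hypothetical helper: hash an object via its pickle byte stream."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return hashlib.sha256(payload).hexdigest()


params_a = {"window": 10, "threshold": 0.5}
params_b = {"window": 10, "threshold": 0.5}
params_c = {"window": 12, "threshold": 0.5}

same = pickle_hash(params_a) == pickle_hash(params_b)
different = pickle_hash(params_a) != pickle_hash(params_c)
```

Note that a naive version like this depends on details of the pickle stream, such as dictionary insertion order, so equal objects are not guaranteed to hash identically in every case.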

Changed

  • Relatively large rework of the TypedIterator. We recommend rereading the example.

v0.32.0 - Better snapshots

17 Apr 14:45

[0.32.0] - 2024-04-17

  • The snapshot plugin now supports a new command line argument --snapshot-only-check that will fail the test if no
    snapshot file is found. This is useful for CI/CD pipelines, where you want to ensure that all snapshots are up to
    date.
  • The snapshot plugin is now installed automatically when you install tpcp. There is no need to set it up in the conftest
    file anymore.

v0.31.2 - More Typed Iterator fixes

01 Feb 11:58

[0.31.2] - 2024-02-01

Fixed

  • TypedIterator no longer runs into a RecursionError when attributes with the wrong name are accessed.

v0.31.1 - Fix agg in typed iterator

01 Feb 10:33

[0.31.1] - 2024-02-01

Fixed

  • TypedIterator now skips aggregation when no values are provided