0.5.0

@usaito usaito released this 07 Sep 01:35
dee1752

The changes are summarized below:

Major updates

  • Add OPE/OPL with continuous actions
    • SyntheticContinuousBanditDataset (#112)
    • ContinuousOPEEstimators [1] (#113)
    • ContinuousNNPolicyLearner [2] (#114)
  • Add weight clipping to IPW and DR (#115)
  • Add automatic hyperparameter tuning of OPE estimators [3] (#116, #131)
  • Add arguments to the SyntheticBanditDataset class to generate more flexible synthetic data (#123)
  • Add a subsample option to the OpenBanditDataset class (#118)
  • Change the input type of the off_policy_objective argument of NNPolicyLearner and add some hyperparameters to it (#132)
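
To illustrate what weight clipping means for IPW, here is a minimal sketch in plain Python. It is not the code from #115: the function name `clipped_ipw` and the parameter names `pscore`, `eval_policy_prob`, and `max_weight` are assumptions chosen for this example. The idea is that each importance weight (evaluation-policy probability divided by behavior-policy probability) is capped at a threshold before averaging, which trades some bias for lower variance.

```python
from typing import Sequence


def clipped_ipw(
    rewards: Sequence[float],
    pscore: Sequence[float],
    eval_policy_prob: Sequence[float],
    max_weight: float = float("inf"),
) -> float:
    """Hypothetical helper: IPW estimate with importance weights capped at max_weight.

    rewards[i]          : observed reward in round i
    pscore[i]           : behavior-policy probability of the logged action
    eval_policy_prob[i] : evaluation-policy probability of the same action
    max_weight          : clipping threshold (infinity means no clipping)
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("at least one round is required")
    if not (n == len(pscore) == len(eval_policy_prob)):
        raise ValueError("all inputs must have the same length")
    if max_weight <= 0:
        raise ValueError("max_weight must be positive")

    total = 0.0
    for reward, p_behavior, p_eval in zip(rewards, pscore, eval_policy_prob):
        if p_behavior <= 0:
            raise ValueError("pscore must be strictly positive")
        # Cap the importance weight so one rare action cannot dominate the mean.
        weight = min(p_eval / p_behavior, max_weight)
        total += weight * reward
    return total / n


if __name__ == "__main__":
    rewards = [1.0, 0.0, 1.0]
    pscore = [0.5, 0.5, 0.1]
    eval_policy_prob = [0.5, 0.5, 0.9]
    print(clipped_ipw(rewards, pscore, eval_policy_prob))                  # unclipped
    print(clipped_ipw(rewards, pscore, eval_policy_prob, max_weight=2.0))  # clipped
```

In the toy data the third round has an importance weight of about 9, so clipping at 2 lowers the estimate noticeably. The same capping can be applied to the importance-weighted correction term of DR.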

Minor updates

  • Fix README (#119)
  • Fix scalar value checking (#122)
  • Add ValueError to the OffPolicyEvaluation class (#125)
  • Fix error messages (#126)
  • Add some error checks (#125, #129)
  • Update Quickstart examples (#127)

Cautions

  • The hyperparameter of obp.ope.SwitchDoublyRobust has been renamed from tau to lambda_
  • The type of the off_policy_objective argument of obp.policy.NNPolicyLearner has changed from callable to str
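
For code that still builds keyword arguments with the old name, a small shim along these lines may ease the transition. This is only a sketch: `migrate_switch_dr_kwargs` is a hypothetical helper, not part of obp, and it covers the tau to lambda_ rename only.

```python
import warnings
from typing import Any, Dict


def migrate_switch_dr_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of kwargs with `tau` renamed to `lambda_`."""
    migrated = dict(kwargs)  # never mutate the caller's dict
    if "tau" in migrated:
        if "lambda_" in migrated:
            raise ValueError("pass either `tau` (old name) or `lambda_` (new name), not both")
        warnings.warn(
            "`tau` was renamed to `lambda_` in 0.5.0",
            DeprecationWarning,
            stacklevel=2,
        )
        migrated["lambda_"] = migrated.pop("tau")
    return migrated


if __name__ == "__main__":
    old_style = {"tau": 10.0}
    print(migrate_switch_dr_kwargs(old_style))
```

The returned dict can then be unpacked into the estimator's constructor, for example `obp.ope.SwitchDoublyRobust(**migrate_switch_dr_kwargs(old_kwargs))`. The change to off_policy_objective has no equivalent mechanical mapping, since a callable has to be replaced by hand with the matching string.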

References

  • [1] Nathan Kallus and Angela Zhou. "Policy Evaluation and Optimization with Continuous Treatments", AISTATS 2018.
  • [2] Nathan Kallus and Masatoshi Uehara. "Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies", NeurIPS 2020.
  • [3] Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, and Miroslav Dudik. "Doubly Robust Off-Policy Evaluation with Shrinkage", ICML 2020.