Description
Describe the feature you want to propose or implement
Implement a general policy learning method based on multiple possible treatments.
Propose a possible solution or implementation
Based on Policy Learning with Confidence.
Using the APOs (average potential outcomes) and several candidate weightings of them (found, e.g., by tree search), one could implement the proposed algorithms, as they rely solely on value estimates and standard errors.
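To make the idea concrete, here is a minimal sketch of one simple selection rule that needs only a value estimate and a standard error per candidate policy: pick the candidate with the highest one-sided lower confidence bound. This is an illustrative stand-in, not necessarily the algorithm from the paper. The function name `select_policy_by_lower_bound`, the `level` parameter, and the numbers are hypothetical, and the bound is pointwise (no adjustment for searching over many candidates).

```python
from statistics import NormalDist


def select_policy_by_lower_bound(values, std_errors, level=0.95):
    """Return (index of best candidate, list of lower bounds).

    Hypothetical helper: each candidate policy is summarised only by its
    estimated value and the standard error of that estimate.
    """
    if not values or len(values) != len(std_errors):
        raise ValueError("values and std_errors must be non-empty and equal in length")
    # One-sided normal critical value; pointwise only, i.e. it does not
    # correct for selecting among several candidates.
    z = NormalDist().inv_cdf(level)
    lower_bounds = [v - z * se for v, se in zip(values, std_errors)]
    best = max(range(len(values)), key=lower_bounds.__getitem__)
    return best, lower_bounds


# Made-up numbers: candidate 1 has the highest point estimate but is noisy.
values = [1.0, 1.2, 0.9]
std_errors = [0.05, 0.30, 0.02]
best, lower_bounds = select_policy_by_lower_bound(values, std_errors)
print(best, [round(lb, 3) for lb in lower_bounds])
```

With these numbers the rule prefers candidate 0 over candidate 1, even though candidate 1 has the larger point estimate, because its standard error is much larger. In an actual implementation the values and standard errors would come from the APO estimates, and the candidates from whatever search procedure is used.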