We need comparative forecast tests for catalog-based forecasts. Some options are below:
1. Mean information gain, from Nandan et al., 2019
2. Parimutuel gambling score
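
To make the second option concrete, here is a minimal sketch of a parimutuel gambling score, not a proposed implementation. It assumes the common formulation in which every forecast stakes one unit per space-time bin, split between "event" and "no event" according to its probability, and the pool for the realised outcome is shared in proportion to the stakes. The function names, the input layout, and the idea of estimating bin probabilities as the fraction of synthetic catalogs with at least one event are all illustrative assumptions.

```python
def bin_probabilities(catalog_bin_counts):
    """Estimate P(at least one event) per bin from synthetic catalogs.

    catalog_bin_counts[c][k] is the number of events that synthetic
    catalog c places in bin k (hypothetical input layout).
    """
    n_catalogs = len(catalog_bin_counts)
    n_bins = len(catalog_bin_counts[0])
    return [
        sum(1 for counts in catalog_bin_counts if counts[k] > 0) / n_catalogs
        for k in range(n_bins)
    ]


def parimutuel_gambling_scores(probabilities, observed):
    """Return the summed parimutuel gambling score of each forecast.

    probabilities[i][k]: probability forecast i assigns to an event in bin k.
    observed[k]: True if at least one observed event fell in bin k.
    """
    n_forecasts = len(probabilities)
    scores = [0.0] * n_forecasts
    for k, occurred in enumerate(observed):
        # Only the part of each unit stake placed on the realised outcome counts.
        stakes = [p[k] if occurred else 1.0 - p[k] for p in probabilities]
        pool = sum(stakes)
        for i, stake in enumerate(stakes):
            if pool == 0.0:
                # Assumption: if nobody backed the outcome, everyone loses the stake.
                scores[i] -= 1.0
            else:
                # Net return: share of the n-unit pool minus the unit staked.
                scores[i] += -1.0 + n_forecasts * stake / pool
    return scores


# Two forecasts, two bins; an event occurred only in the first bin.
forecast_probs = [[0.8, 0.1], [0.2, 0.1]]
observed_bins = [True, False]
scores = parimutuel_gambling_scores(forecast_probs, observed_bins)
```

Because the game is zero-sum, the scores across forecasts sum to zero in every bin where at least one forecast backed the outcome, which makes this a natural way to rank more than two forecasts at once.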