Add ensemble model #72
Here is an updated specification of what the final ensemble should look like. I would suggest not implementing all of this in one go, but in several iterations. There are two main components:

- Controller: decides which models get used
- Consolidator: decides how to aggregate model results into a single prediction

For now, we should skip the Controller and confidence scores. Currently, none of our models have activation conditions, i.e., all of them can always be run. Confidence scores will come into play if new models that produce such scores are added (e.g., LNNs).
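To make the intended split concrete, here is a minimal Python sketch of the two components. All class and method names are hypothetical, not part of the existing codebase:

```python
from abc import ABC, abstractmethod

class Controller(ABC):
    """Decides which models get used for a given input (skipped for now)."""

    @abstractmethod
    def select_models(self, molecule, available_models):
        """Return the subset of available_models to run on this molecule."""

class Consolidator(ABC):
    """Aggregates per-model predictions into one prediction per class."""

    @abstractmethod
    def aggregate(self, predictions, trust_scores):
        """predictions: {model_name: {chebi_class: bool}}
        trust_scores: {model_name: {chebi_class: (ppv, npv)}}
        Returns a single {chebi_class: bool} prediction."""
```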
Problem
Currently, we have a range of different approaches for classifying molecules in ChEBI: ELECTRA-based, GNN-based (https://github.com/ChEB-AI/python-chebai-graph), and algorithmic / logic-based (https://github.com/sfluegel05/chemlog2). All approaches have specific strengths and weaknesses. The goal of an ensemble is to take different methods and aggregate their predictions so that the final result is better than the individual results.

Task
The ensemble should take the following input:

- the predictions of each model for each class
- a "trustworthiness" score for each model (this score is specific to each class, and possibly different for positive and negative predictions)

It should aggregate these values into a single prediction for each class, taking into account both the predictions and the trustworthiness of the models.
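One possible layout for this input, as nested dictionaries. The model names, class identifier, and scores below are purely illustrative:

```python
# Hypothetical input layout; all identifiers and values are illustrative.
predictions = {
    "electra": {"CHEBI:12345": True},
    "gnn":     {"CHEBI:12345": False},
    "chemlog": {"CHEBI:12345": True},
}
trust_scores = {
    # per model and class: (PPV, NPV) estimated on a test set
    "electra": {"CHEBI:12345": (0.70, 0.95)},
    "gnn":     {"CHEBI:12345": (0.80, 0.99)},
    "chemlog": {"CHEBI:12345": (0.60, 0.90)},
}
```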
Example:
Given a ChEBI class, we have received the following predictions:

- Model A: true
- Model B: false
- Model C: true
The simplest approach would be to weight all models equally and return true for this class (with a 2-1 vote). However, we should also take the trustworthiness into account. These values might come from the precision / positive predictive value (PPV; TP / (TP + FP)) and the negative predictive value (NPV; TN / (TN + FN)) of a model on a test set. In other words: if models A and C predict "true" for this class, they are correct in 70% and 60% of cases, respectively (according to their PPVs). If model B predicts "false" for this class, it is correct in 99% of cases (according to its NPV).

An aggregation method would then weigh two predictions with trustworthiness of 0.7 and 0.6 against one with 0.99. Depending on the aggregation method used, it might decide to trust model B.
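To illustrate how the choice of aggregation method changes the outcome, here is a sketch of two variants: a simple weighted vote (which sides with models A and C here) and a log-odds vote (which sides with model B). Function names and the PPV/NPV values not given in the example above are assumptions:

```python
import math

# votes: list of (prediction, ppv, npv) per model.
# A "true" vote is weighted by the model's PPV, a "false" vote by its NPV.
def weighted_vote(votes):
    true_weight = sum(ppv for pred, ppv, _ in votes if pred)
    false_weight = sum(npv for pred, _, npv in votes if not pred)
    return true_weight > false_weight

# Same idea, but trustworthiness enters as log-odds, so a single highly
# reliable "false" vote can outweigh several mediocre "true" votes.
def log_odds_vote(votes):
    def logit(p):
        return math.log(p / (1 - p))
    score = sum(logit(ppv) if pred else -logit(npv)
                for pred, ppv, npv in votes)
    return score > 0

votes = [
    (True, 0.70, 0.95),   # model A (NPV is a made-up placeholder)
    (False, 0.80, 0.99),  # model B (PPV is a made-up placeholder)
    (True, 0.60, 0.90),   # model C (NPV is a made-up placeholder)
]
print(weighted_vote(votes))  # True:  0.7 + 0.6 = 1.3 > 0.99
print(log_odds_vote(votes))  # False: logit(0.7) + logit(0.6) < logit(0.99)
```

Under the simple weighted vote, the two positive votes (0.7 + 0.6 = 1.3) outweigh the single negative vote (0.99). Treating the scores as probabilities and summing log-odds reverses the decision, since log(0.99/0.01) ≈ 4.6 exceeds log(0.7/0.3) + log(0.6/0.4) ≈ 1.25.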
Future work