Both retain_train.py and retain_evaluation.py appear to use the same dataset for testing and for evaluation. Is this correct? After each epoch, the test set is used to measure model performance, and the same data is then used by the evaluation script to produce the analytical graphs. Shouldn't the final evaluation be done on a separate dataset?
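To make the question concrete, here is a minimal sketch of the kind of split I was expecting. It is not taken from the repository: `three_way_split`, its fractions, and the sample count are all hypothetical, and it only shows the idea of keeping the per-epoch monitoring data separate from the data used for the final graphs.

```python
import random


def three_way_split(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Return disjoint index lists (train, validation, test).

    Hypothetical helper, not part of the repository.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for a reproducible split

    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)

    test_idx = indices[:n_test]                # used once, for the final evaluation
    val_idx = indices[n_test:n_test + n_val]   # used after each epoch during training
    train_idx = indices[n_test + n_val:]       # used to fit the model
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = three_way_split(1000)
```

With a split like this, the training script would only see `train_idx` and `val_idx`, and the evaluation script would only see `test_idx`.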
My other question is about model calibration. During evaluation, a calibration graph is computed and drawn, but the model is not calibrated during training. Is there a reason for this?
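By calibration I mean something like a post-hoc step fitted on held-out data. The sketch below uses Platt scaling as one possible example; this is my assumption, not something the repository does. The function names and the synthetic, deliberately overconfident data are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_loss(logits, labels, a=1.0, b=0.0):
    """Mean binary cross-entropy of sigmoid(a * logit + b)."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(a * z + b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(logits)


def fit_platt(logits, labels, lr=0.1, steps=500):
    """Fit scale a and offset b by gradient descent on the log loss."""
    a, b = 1.0, 0.0  # start from the uncalibrated model
    n = len(logits)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for z, y in zip(logits, labels):
            err = sigmoid(a * z + b) - y
            grad_a += err * z
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Synthetic held-out set (hypothetical): the raw logits are overconfident,
# because the true probability follows sigmoid(logit / 3).
rng = random.Random(0)
val_logits = [rng.uniform(-4.0, 4.0) for _ in range(500)]
val_labels = [1 if rng.random() < sigmoid(z / 3.0) else 0 for z in val_logits]

a, b = fit_platt(val_logits, val_labels)
loss_before = log_loss(val_logits, val_labels)
loss_after = log_loss(val_logits, val_labels, a, b)
```

The fitted `a` and `b` would then be applied to the model's outputs before the calibration graph is drawn, ideally on data that was not used to fit them.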
Sorry if I missed something that might be obvious and thanks for your help and time.