Hold-out data #23

florianhartig · 2021-05-07T08:16:26Z

Hi @MaximilianPi, I'm just looking at the ML datasets. What do we want to do with the hold-out data for those? Should we add them here and hide them, or keep them with the submission server on the ML repo?

MaximilianPi · 2021-05-12T09:56:40Z

Hi,
two options:
a) two versions (full and without hold-out) of the datasets in the EcoData package (as we did with the titanic and the plant-pollinator database). However, the students were confused by the two different versions, and one even said it was disappointing that the hold-outs are theoretically available in EcoData.
b) as you said, only one version of the wine, nasa, and flower datasets, without the hold-outs (the hold-outs are also available in the separate submission server repo: https://github.com/MaximilianPi/submission_server)

If you want to use the wine, nasa, and flower datasets for other courses (e.g. the stats courses) I don't think it is a big problem if half of the data is missing, right?

I am in favour of b), or of hiding them in EcoData.
