build_best versus build_merged: what is the difference if no test set is given? #39

Open
JanoschMenke opened this issue Jan 11, 2025 · 0 comments


I just wanted to ask what the difference is between build_best and build_merged when no test set is specified, as in the configuration below:

config = OptimizationConfig(
    data=Dataset(
        input_column="canonical",
        response_column="molwt",
        training_dataset_file="tests/data/DRD2/subset-50/train.csv",
        split_strategy=Random(),
    ),
)

Based on the results I get, I assume that build_merged is trained on the complete training_dataset supplied, while build_best is trained only on a subset, presumably generated by the split_strategy. But when I use 10-fold cross-validation, which of the 10 folds is the data split by?
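To illustrate the assumption I am making, here is a minimal, self-contained sketch (the helpers `random_split` and `kfold_splits` are hypothetical, not QSARtuna API): with a random holdout split there is one obvious training subset, but with 10-fold cross-validation there are ten equally valid train/test partitions, so it is unclear which one build_best would retrain on.

```python
# Hypothetical illustration of the question, NOT QSARtuna's implementation:
# a random holdout yields one training subset; k-fold CV yields k of them.
import random


def random_split(n_rows, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) for a random holdout split."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def kfold_splits(n_rows, k=10):
    """Return k (train_indices, test_indices) pairs for k-fold CV.

    With 10-fold CV there is no single holdout set, which is why it is
    ambiguous which fold a 'best model' rebuild would leave out.
    """
    indices = list(range(n_rows))
    fold_size = n_rows // k
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    folds[-1].extend(indices[k * fold_size:])  # put any remainder in last fold
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# With 50 rows (as in tests/data/DRD2/subset-50/train.csv):
train, test = random_split(50)
print(len(train), len(test))      # one 40/10 split
print(len(kfold_splits(50)))      # ten candidate 45/5 splits
```

So my question is effectively: does build_best use a single partition like `random_split` produces, and if the strategy is 10-fold CV, which of the ten partitions applies?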

# Build (re-Train) and save the best model.
build_best(buildconfig, "target/best.pkl")

# Build (train) and save the model on the merged train+test data.
build_merged(buildconfig, "target/merged.pkl")