bulk_register can fail when existing trials are all Failed and do not have result columns yet #831

Open
@eujing

If an experiment fails on its first trial (for example, because of quota or provisioning issues) and MLOS then crashes because our user scripts expect a result metric, restarting the same experiment raises pandas indexing errors for missing columns in bulk_register.

image

In such cases, _adjust_signs_df gets called with data frames that may be empty, or that do not yet contain the target optimization columns because every run so far has failed.
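
To illustrate the failure mode outside MLOS, here is a minimal sketch in plain pandas. It assumes the failing code selects the target columns by label; the column names `status` and `score` and the `opt_targets` list are hypothetical stand-ins, not names taken from MLOS.

```python
import pandas as pd

# Hypothetical scores frame: every trial failed, so no result column exists yet.
scores = pd.DataFrame({"status": ["FAILED", "FAILED"]})

# Hypothetical optimization target that the user scripts would normally report.
opt_targets = ["score"]

try:
    # Selecting a list of labels that are absent from the frame raises KeyError.
    scores[opt_targets]
    raised = False
except KeyError as err:
    raised = True
    print(f"KeyError: {err}")
```

The same `KeyError` occurs when the frame is completely empty, since it has no columns at all.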

Then this line will fail with indexing errors:
image

Maybe the data frame could be created first, and then checked for the target columns before the signs are adjusted.
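
A rough sketch of what such a check might look like, not the actual MLOS code: `adjust_signs_if_present` is a hypothetical helper, and the `opt_targets` mapping of column name to sign (+1 or -1) is an assumed representation of the optimization targets.

```python
from typing import Dict, Optional

import pandas as pd


def adjust_signs_if_present(
    scores: pd.DataFrame,
    opt_targets: Dict[str, int],
) -> Optional[pd.DataFrame]:
    """Hypothetical helper: flip signs only when all target columns exist.

    Returns None when the frame is empty or is missing any target column,
    so the caller can skip registering scores for those trials.
    """
    missing = [col for col in opt_targets if col not in scores.columns]
    if scores.empty or missing:
        return None
    # Multiplying by a Series aligns its index with the frame's columns,
    # so each target column is scaled by its own sign.
    signs = pd.Series(opt_targets, dtype=float)
    return scores[list(opt_targets)] * signs


# All trials failed, so there is no "score" column and nothing to adjust.
failed_only = pd.DataFrame({"status": ["FAILED"]})
print(adjust_signs_if_present(failed_only, {"score": -1}))

# Once results exist, the signs are applied as usual.
with_results = pd.DataFrame({"score": [1.0, 2.0]})
print(adjust_signs_if_present(with_results, {"score": -1}))
```

Returning None rather than raising leaves it to the caller to decide how to treat an experiment whose trials have all failed so far.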
