I am working on the CI step of the pipeline. It consists of three main steps: Get Pipeline ID, Trigger ML Training Pipeline, and Publish artifact.
The first step always succeeds.
The second step sometimes works and sometimes crashes. It crashes in the evaluation step because it does not get an MSE value. This happens randomly: the pipeline can suddenly throw the error even though the code has not changed.
So far, the pipeline has also always failed at the third step, specifically at the "Determine if evaluation succeeded" step. I have traced the error to the "automobile-publish-model-artifact-template.yml" file, to the line "FOUND_MODEL=$(az ml model list -g $(RESOURCE_GROUP) --workspace-name $(WORKSPACE_NAME) --tag BuildId=$(Build.BuildId) --query '[0]')". I could not narrow the error down further, because the second step keeps failing randomly for long stretches of time.
I just did a repro. The issue I found is that the code in evaluate_model.py looks for mse, but the model, which is trained with lightgbm, only publishes auc and the f1 score as metrics.
Changing the code in that file from mse to auc and inverting the sign of the comparison (the check that decides which model is better) made it work. Before that change, I got the same error you describe.
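For illustration, here is a minimal sketch of that kind of change. It is not the actual contents of evaluate_model.py: the function name, the metric dictionaries, and the metric keys ("auc", "f1", "mse") are assumptions made for the example. The point is that mse is better when lower while auc is better when higher, so the comparison has to flip, and a missing metric should fail with a clear message instead of a confusing crash.

```python
# Hypothetical helper, not taken from the repository's evaluate_model.py.
# Metric key names are assumptions; adjust them to what the training run logs.
HIGHER_IS_BETTER = {"auc": True, "f1": True, "mse": False}


def is_new_model_better(new_metrics, current_metrics, metric="auc"):
    """Return True if the new model beats the current one on `metric`."""
    if metric not in HIGHER_IS_BETTER:
        raise ValueError(f"Unknown metric: {metric}")

    # Fail early with a readable message if a run did not log the metric.
    for label, metrics in (("new", new_metrics), ("current", current_metrics)):
        if metric not in metrics:
            raise KeyError(
                f"{label} model has no '{metric}' metric; "
                f"available metrics: {sorted(metrics)}"
            )

    new_value = new_metrics[metric]
    current_value = current_metrics[metric]

    # For auc/f1 a larger value wins; for mse a smaller value wins.
    if HIGHER_IS_BETTER[metric]:
        return new_value > current_value
    return new_value < current_value
```

With a check like this, asking for mse on a run that only logged auc and f1 raises a KeyError naming the metrics that are actually available, which makes the cause easier to spot in the pipeline log.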