Allow specifying metric_definitions on ModelTrainer #5018

straygar · 2025-02-06T10:59:45Z

Describe the feature you'd like
As with the Estimator abstraction, I would like ModelTrainer to support SageMaker's CloudWatch metrics, which are extracted from the job logs.

How would this feature be used? Please describe.
Not sure. Either an explicit argument in model_trainer, or a way for the user to pass a create_job_args dictionary to work around the abstraction when some API feature is not exposed:

trainer = ModelTrainer(
  ...,
  # proposed argument: a list of Name/Regex dicts
  metric_definitions=[
    {"Name": "training_iteration", "Regex": "Iteration (.+), Loss .+,"},
  ]
)

(we could also just get rid of Name and Regex and just have a dict[MetricName, Pattern] there)
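To illustrate the dict-based shorthand, here is a minimal sketch of how a dict[MetricName, Pattern] could be expanded into the Name/Regex list form. The helper name to_metric_definitions and the sample log line are hypothetical and made up for this example; nothing here is part of the SageMaker SDK.

```python
import re


def to_metric_definitions(patterns: dict[str, str]) -> list[dict[str, str]]:
    """Expand {metric_name: regex} into a list of Name/Regex dicts.

    Hypothetical helper for illustration only; not an SDK function.
    """
    for pattern in patterns.values():
        re.compile(pattern)  # raise re.error early if a pattern is invalid
    return [{"Name": name, "Regex": pattern} for name, pattern in patterns.items()]


metric_definitions = to_metric_definitions(
    {"training_iteration": "Iteration (.+), Loss .+,"}
)

# Made-up log line, used only to check what the regex captures locally.
sample_line = "Iteration 12, Loss 0.345,"
match = re.search(metric_definitions[0]["Regex"], sample_line)
captured = match.group(1) if match else None
```

Checking each pattern against a sample log line locally like this is a cheap way to catch a regex that captures the wrong group before a training job is launched.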

Describe alternatives you've considered
Using the Estimator class. I recently moved away from it, as the ModelTrainer abstraction makes more sense to me and other scientists/engineers on my team.

Additional context
n/a

mufaddal-rohawala added the component: pysdk-team Related to SageMaker Python SDK Core Issues label Feb 6, 2025

benieric mentioned this issue Feb 14, 2025

feat: Add support for MetricDefinitions in ModelTrainer and update docs #5040

Open

10 tasks