feat: ✨ ClearML training loss logging #1844

odulcy-mindee · 2025-01-16T15:48:40Z

No description provided.

references/detection/train_pytorch.py

felixdittrich92

Thanks :)

codecov · 2025-01-16T18:29:53Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.61%. Comparing base (b0d2728) to head (f670db7).
Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1844      +/-   ##
==========================================
+ Coverage   96.58%   96.61%   +0.02%     
==========================================
  Files         165      165              
  Lines        7940     7940              
==========================================
+ Hits         7669     7671       +2     
+ Misses        271      269       -2

| Flag | Coverage Δ |
|---|---|
| unittests | `96.61% <ø> (+0.02%)` ⬆️ |

kuraga · 2025-01-17T12:41:55Z

Sorry, weren't there plans to make logging more universal?
(clearml_log argument name doesn't have a place for other loggers...)

felixdittrich92 · 2025-01-17T12:46:19Z

Hi @kuraga 👋🏼,

What do you mean by other loggers?
Sorry, I'm a bit out of the loop on the ClearML topic (still using W&B on my own 😅)
CC @odulcy-mindee

kuraga · 2025-01-17T12:50:32Z

@felixdittrich92 Hm... Is W&B exactly the answer? 😄

I like a lightweight lazyscribe BTW...

I don't ask "to implement other loggers" but about some abstract layer...

felixdittrich92 · 2025-01-17T13:31:21Z

> @felixdittrich92 Hm... Is W&B exactly the answer? 😄
>
> I like a lightweight lazyscribe BTW...
>
> I don't ask "to implement other loggers" but about some abstract layer...

Ah, got it. I thought the question was about something ClearML-specific 😅 For me personally W&B is still the answer xD But yeah, to each their own.

Mh, yes, I understand the request. The thing is that the training scripts / references folder is more of a "collection" of scripts that everyone can modify to their own needs. That has advantages, but also disadvantages (for example, it is harder to add globally available features shared between the different training tasks, like logging, early stopping, and so on).

With a Trainer object, like the one you know from the transformers lib for example, it would be much easier to implement such features. On the other hand, it would wrap everything, reduce transparency, and make it much harder for (experienced) users to adapt the training process to their own needs 😅
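One possible middle ground between a bare script and a full Trainer object is a plain loop that accepts optional per-step hooks. The sketch below is only an illustration of that trade-off, not docTR's design: `StepHook` and `fit_one_epoch` are hypothetical names, and the loop consumes precomputed loss values instead of running a model.

```python
from typing import Callable, Iterable, List

# Hypothetical hook signature: (step, loss) -> None. Not part of docTR.
StepHook = Callable[[int, float], None]


def fit_one_epoch(batch_losses: Iterable[float], hooks: List[StepHook]) -> float:
    """Plain, readable loop that still lets shared features plug in."""
    total, count = 0.0, 0
    for step, loss in enumerate(batch_losses):
        # A real script would run the forward/backward pass here; this sketch
        # only consumes precomputed losses to stay self-contained.
        total += loss
        count += 1
        for hook in hooks:
            hook(step, loss)
    return total / max(count, 1)


# Usage: a logging hook is just a callable, so the loop itself stays transparent.
history: List[float] = []
mean_loss = fit_one_epoch([0.9, 0.7, 0.5], hooks=[lambda step, loss: history.append(loss)])
```

Features such as logging or early stopping could then live in small callables shared across scripts, while the loop remains editable by hand.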

kuraga · 2025-01-17T14:07:49Z

> the training scripts / references folder is more a "collection" of scripts

Aaahh, I didn't catch that. My fault. Training docTR is mostly Mindee's internal job. Thanks, @felixdittrich92 :)

P.S. Add READMEs to /references, etc. :)

felixdittrich92 · 2025-01-17T14:11:10Z

Mh, not exactly: we use the same scripts for pre-training all the models, but users can also use them for fine-tuning on their own needs / datasets :)

The task-specific subfolders already have READMEs :)
For example: https://github.com/mindee/doctr/tree/main/references/recognition

felixdittrich92 · 2025-01-17T14:13:14Z

And the corresponding doc: https://mindee.github.io/doctr/latest/using_doctr/custom_models_training.html

kuraga · 2025-01-17T14:21:10Z

Well, if the mechanism is public, I'll illustrate my

> I don't ask "to implement other loggers" but about some abstract layer...

point: the `clearml_log: bool` argument could be `logger: logging.Logger` :)
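As a rough illustration of that suggestion, a training function could take a standard-library `logging.Logger` instead of a backend-specific flag. This is a sketch under assumptions: `fit_one_epoch` and `ListHandler` are hypothetical names, the message format is made up, and the loop consumes precomputed losses rather than training a model.

```python
import logging
from typing import Iterable, List, Optional


def fit_one_epoch(batch_losses: Iterable[float], logger: Optional[logging.Logger] = None) -> float:
    """Average the batch losses and emit one log record per step."""
    logger = logger or logging.getLogger("train")
    total, count = 0.0, 0
    for step, loss in enumerate(batch_losses):
        total += loss
        count += 1
        # The caller decides where records go by attaching handlers.
        logger.info("train_loss step=%d value=%.4f", step, loss)
    return total / max(count, 1)


class ListHandler(logging.Handler):
    """Collects formatted messages in memory (stand-in for a real backend)."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


logger = logging.getLogger("doctr.train.sketch")
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)
mean_loss = fit_one_epoch([0.5, 0.25], logger=logger)
```

The function no longer knows which backend is in use; swapping backends means attaching a different handler.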

kuraga · 2025-01-17T19:20:53Z

Update: hmm, now I see that clearml.Logger does not inherit from logging.Logger...
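Because the two classes are unrelated, one way to bridge them would be a small `logging.Handler` that forwards scalar-carrying records to a reporting callable. The sketch below is an assumption-laden illustration: `ScalarForwardingHandler` and the `scalar_value` / `series` / `iteration` extras are hypothetical, and a fake reporter stands in for ClearML so the block runs without it. With ClearML installed and a task initialised, `clearml.Logger.current_logger().report_scalar` could presumably be passed instead, since it accepts `title`, `series`, `value`, and `iteration`; that pairing is untested here.

```python
import logging
from typing import Callable, List, Tuple


class ScalarForwardingHandler(logging.Handler):
    """Forwards records that carry a scalar to a scalar-reporting callable."""

    def __init__(self, report_scalar: Callable[..., None]) -> None:
        super().__init__()
        self._report_scalar = report_scalar

    def emit(self, record: logging.LogRecord) -> None:
        value = getattr(record, "scalar_value", None)
        if value is None:
            return  # ordinary text record, nothing to plot
        self._report_scalar(
            title=record.getMessage(),
            series=getattr(record, "series", "train"),
            value=float(value),
            iteration=int(getattr(record, "iteration", 0)),
        )


# Fake reporter standing in for a real experiment tracker.
reported: List[Tuple[str, str, float, int]] = []


def fake_report_scalar(title: str, series: str, value: float, iteration: int) -> None:
    reported.append((title, series, value, iteration))


logger = logging.getLogger("doctr.adapter.sketch")
logger.setLevel(logging.INFO)
logger.addHandler(ScalarForwardingHandler(fake_report_scalar))

logger.info("epoch started")  # no scalar attached, so the handler skips it
logger.info("Training Loss", extra={"scalar_value": 0.42, "series": "train_loss", "iteration": 3})
```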

ThanhNX0401 · 2025-01-18T04:38:48Z

Is there any reason why docTR doesn't output the training loss log to W&B? How can I tell if the model is overfitting or underfitting?

felixdittrich92 · 2025-01-20T07:06:46Z

> Is there any reason why docTR don't output the Training loss log to WB? How can i tell if the model is overfit or underfit?

Oh, you are right, we should do this for every train script, the same way it's done with the clearml_log passed to the train function. Would you like to open a PR?
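For readers unfamiliar with the pattern being described, a flag-gated loss report inside a training function might look roughly like the following. This is not the code from the PR: `make_clearml_reporter`, `fit_one_epoch`, the `reporter` and `start_iteration` parameters, and the title/series strings are all illustrative assumptions. Only `Logger.current_logger()` and `report_scalar` are ClearML calls, and they need a ClearML task to have been initialised beforehand.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Reporter = Callable[[float, int], None]


def make_clearml_reporter() -> Reporter:
    """Build a reporter backed by ClearML (requires an initialised task)."""
    # Imported lazily so the script still runs without clearml installed.
    from clearml import Logger

    clearml_logger = Logger.current_logger()

    def report(value: float, iteration: int) -> None:
        clearml_logger.report_scalar(
            title="Training Loss", series="train_loss", value=value, iteration=iteration
        )

    return report


def fit_one_epoch(
    batch_losses: Iterable[float],
    clearml_log: bool = False,
    reporter: Optional[Reporter] = None,
    start_iteration: int = 0,
) -> float:
    """Average the batch losses, reporting each one when the flag is set."""
    if clearml_log and reporter is None:
        reporter = make_clearml_reporter()
    total, count = 0.0, 0
    for offset, loss in enumerate(batch_losses):
        total += loss
        count += 1
        if clearml_log and reporter is not None:
            reporter(loss, start_iteration + offset)
    return total / max(count, 1)


# Usage with an injected fake reporter, so ClearML is never imported here.
seen: List[Tuple[int, float]] = []
mean_loss = fit_one_epoch(
    [1.0, 3.0], clearml_log=True, reporter=lambda v, i: seen.append((i, v)), start_iteration=10
)
```

Plotting the per-step training loss next to the validation loss is what makes over- or underfitting visible, which is the concern raised above.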

felixdittrich92 · 2025-01-23T20:20:04Z

@ThanhNX0401 Fixed now on main branch 😊

feat: ✨ ClearML training loss logging

559475b

odulcy-mindee requested a review from felixdittrich92 January 16, 2025 15:48

felixdittrich92 requested changes Jan 16, 2025

references/detection/train_pytorch.py (outdated, resolved)

Felix, the man who can find a needle in a haystack

f670db7

odulcy-mindee requested a review from felixdittrich92 January 16, 2025 16:27

felixdittrich92 approved these changes Jan 16, 2025

odulcy-mindee merged commit 38efc1b into mindee:main Jan 16, 2025
67 checks passed

odulcy-mindee deleted the clearml_support branch January 16, 2025 17:01

felixdittrich92 added this to the 0.11.0 milestone Jan 17, 2025

felixdittrich92 linked an issue Jan 17, 2025 that may be closed by this pull request

[references] Add slack logging & fix clearml integration #1458

Closed

2 tasks

felixdittrich92 mentioned this pull request Jan 17, 2025

[references] Add slack logging & fix clearml integration #1458

Closed

2 tasks
