feat: ✨ ClearML training loss logging #1844

Merged
merged 2 commits into from
Jan 16, 2025


odulcy-mindee
Collaborator

No description provided.

Contributor

@felixdittrich92 felixdittrich92 left a comment

Thanks :)

@odulcy-mindee odulcy-mindee merged commit 38efc1b into mindee:main Jan 16, 2025
67 checks passed
@odulcy-mindee odulcy-mindee deleted the clearml_support branch January 16, 2025 17:01

codecov bot commented Jan 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.61%. Comparing base (b0d2728) to head (f670db7).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1844      +/-   ##
==========================================
+ Coverage   96.58%   96.61%   +0.02%     
==========================================
  Files         165      165              
  Lines        7940     7940              
==========================================
+ Hits         7669     7671       +2     
+ Misses        271      269       -2     
| Flag | Coverage Δ |
|---|---|
| unittests | 96.61% <ø> (+0.02%) ⬆️ |


@felixdittrich92 felixdittrich92 added this to the 0.11.0 milestone Jan 17, 2025
@felixdittrich92 felixdittrich92 added labels Jan 17, 2025:
- ext: references (Related to references folder)
- framework: pytorch (Related to PyTorch backend)
- framework: tensorflow (Related to TensorFlow backend)
- topic: text detection (Related to the task of text detection)
- topic: text recognition (Related to the task of text recognition)
- topic: character classification (Related to the task of character classification)
- type: new feature (New feature)
@felixdittrich92 felixdittrich92 linked an issue Jan 17, 2025 that may be closed by this pull request
2 tasks
@kuraga

kuraga commented Jan 17, 2025

Sorry, weren't there plans to make logging more universal?
(The clearml_log argument name leaves no room for other loggers...)

@felixdittrich92
Contributor

Hi @kuraga 👋🏼,

What do you mean by other loggers?
Sorry, I'm a bit out of the loop on the ClearML topic (still using W&B on my own 😅)
CC @odulcy-mindee

@kuraga

kuraga commented Jan 17, 2025

@felixdittrich92 Hm... Is W&B exactly the answer? 😄

I like a lightweight lazyscribe BTW...

I don't ask "to implement other loggers" but about some abstract layer...

@felixdittrich92
Contributor

> @felixdittrich92 Hm... Is W&B exactly the answer? 😄
>
> I like a lightweight lazyscribe BTW...
>
> I don't ask "to implement other loggers" but about some abstract layer...

Ah, got it. I thought the question was about something ClearML-specific 😅 For me personally, W&B is still the answer xD But yeah, everyone should use what they prefer.

Mh, yes, I understand the request. The thing is that the training scripts in the references folder are more of a "collection" of scripts that everyone can modify to their own needs. That has some advantages but also disadvantages (for example, it is harder to add globally available features shared between the different training tasks, like logging, early stopping, and so on).

With a Trainer object, like the one you know from the transformers lib, it would be much easier to implement such features. On the other hand, it would wrap everything, reduce transparency, and make it much harder for (experienced) users to adapt the training process to their own needs 😅

@kuraga

kuraga commented Jan 17, 2025

> the training scripts / references folder is more a "collection" of scripts

Aaahh, I didn't catch that. My fault. Training docTR is mostly Mindee's internal job. Thanks, @felixdittrich92 :)

P.S. Add READMEs to /references, etc. :)

@felixdittrich92
Contributor

felixdittrich92 commented Jan 17, 2025

Mh, not exactly: we use the same scripts for pre-training all the models, but users can also use them for fine-tuning on their own datasets :)

The task-specific subfolders already have READMEs :)
For example: https://github.com/mindee/doctr/tree/main/references/recognition

@kuraga

kuraga commented Jan 17, 2025

Well, if the mechanism is public, I'll illustrate my earlier point:

> I don't ask "to implement other loggers" but about some abstract layer...

The `clearml_log: bool` argument could be `logger: logging.Logger` :)

@kuraga

kuraga commented Jan 17, 2025

Update: hmm, now I see that clearml.Logger does not inherit from logging.Logger...
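
Because `clearml.Logger` is not a `logging.Logger`, an abstract layer would probably need its own small interface rather than a shared base class. A minimal sketch of that idea follows. The names `MetricLogger`, `StdlibMetricLogger`, `ClearMLMetricLogger`, `log_scalar` and `report_losses` are hypothetical and not part of docTR. The ClearML adapter assumes a ClearML Task has already been initialised and uses `Logger.current_logger()` and `report_scalar`.

```python
import logging
from typing import Protocol


class MetricLogger(Protocol):
    """Hypothetical interface: anything that can record a scalar at a step."""

    def log_scalar(self, name: str, value: float, step: int) -> None: ...


class StdlibMetricLogger:
    """Adapter that writes metrics through the standard logging module."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def log_scalar(self, name: str, value: float, step: int) -> None:
        self._logger.info("step=%d %s=%.4f", step, name, value)


class ClearMLMetricLogger:
    """Adapter around clearml.Logger (assumes a ClearML Task already exists)."""

    def __init__(self) -> None:
        # Imported lazily so that clearml stays an optional dependency.
        from clearml import Logger

        self._logger = Logger.current_logger()

    def log_scalar(self, name: str, value: float, step: int) -> None:
        self._logger.report_scalar(title=name, series=name, value=value, iteration=step)


def report_losses(losses: list[float], metric_logger: MetricLogger) -> None:
    """Stand-in for a training loop: reports one loss value per step."""
    for step, loss in enumerate(losses):
        metric_logger.log_scalar("train_loss", loss, step)
```

With this shape, a training script would accept one `MetricLogger` instead of a per-backend boolean, and a W&B or lazyscribe adapter could be added without touching the loop.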

@ThanhNX0401

Is there any reason why docTR doesn't output the training loss log to W&B? How can I tell if the model is overfitting or underfitting?

@felixdittrich92
Contributor

> Is there any reason why docTR doesn't output the training loss log to W&B? How can I tell if the model is overfitting or underfitting?

Oh, you are right: we should do this in every training script, the same way it's done with the clearml_log flag passed to the train function. Would you like to open a PR?
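
A rough illustration of per-step training-loss reporting gated by an optional argument, in the spirit of the flag mentioned above. This is a toy sketch, not the actual docTR training code: `fit_one_epoch` here takes pre-computed loss values instead of a model and data loader, and the `Reporter` callback type is a hypothetical stand-in for a ClearML or W&B call.

```python
from typing import Callable, Iterable, Optional

# Hypothetical reporter signature: (metric name, value, global step).
Reporter = Callable[[str, float, int], None]


def fit_one_epoch(
    batch_losses: Iterable[float],
    start_step: int = 0,
    report: Optional[Reporter] = None,
) -> tuple[float, int]:
    """Toy epoch: `batch_losses` stands in for the losses a real model would produce.

    Returns the mean training loss and the next global step.
    """
    total, count, step = 0.0, 0, start_step
    for loss in batch_losses:
        total += loss
        count += 1
        if report is not None:  # plays the role of a boolean logging flag
            report("train_loss", loss, step)
        step += 1
    mean_loss = total / count if count else float("nan")
    return mean_loss, step
```

Once the training loss is logged next to the validation loss, the usual reading applies: training loss that keeps falling while validation loss rises suggests overfitting, and both staying high suggests underfitting.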

@felixdittrich92
Contributor

@ThanhNX0401 Fixed now on main branch 😊

Development

Successfully merging this pull request may close these issues.

[references] Add slack logging & fix clearml integration