Support str(datamodule) #9947

Open
carmocca opened this issue Oct 15, 2021 · 11 comments · May be fixed by #20301
@carmocca
Contributor

🚀 Feature

Add support

print(str(MyDataModule()))

Motivation

It currently prints:

<__main__.MyDataModule object at 0x10284c970>

Pitch

It could print the DataLoader structure:

MyDataModule(
    train_dataloader: {"a": DataLoaderClass(batch_size=8, num_batches=16, num_workers=2), "b": DataLoaderClass(batch_size=2, num_batches=16, num_workers=2)}
    val_dataloader: [DataLoaderClass(batch_size=3, num_batches=14, num_workers=0), DataLoaderClass(batch_size=8, num_batches=4, num_workers=0)]
    test_dataloader: DataLoaderClass(batch_size=4, num_batches=7, num_workers=2)
)

Or the number of batches per dataloader, similar to what was done in #5965
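
A minimal sketch of how the pitched output could be produced, in plain Python. `FakeLoader` and `DescribedDataModule` are hypothetical stand-ins, not PyTorch or Lightning classes, and the sketch assumes every `*_dataloader()` hook can be called without any prior setup, which may not hold for a real datamodule.

```python
class FakeLoader:
    """Hypothetical stand-in for a DataLoader; stores only the fields we print."""

    def __init__(self, batch_size, num_workers=0):
        self.batch_size = batch_size
        self.num_workers = num_workers

    def __repr__(self):
        return (
            f"FakeLoader(batch_size={self.batch_size}, "
            f"num_workers={self.num_workers})"
        )


class DescribedDataModule:
    """Hypothetical base class (not LightningDataModule) that adds __str__."""

    _hooks = ("train_dataloader", "val_dataloader", "test_dataloader")

    def __str__(self):
        lines = [f"{type(self).__name__}("]
        for name in self._hooks:
            hook = getattr(self, name, None)
            if hook is None:
                continue  # subclass does not define this hook
            # Assumption: calling the hook here is cheap and side-effect free.
            lines.append(f"    {name}: {hook()!r}")
        lines.append(")")
        return "\n".join(lines)


class MyDataModule(DescribedDataModule):
    def train_dataloader(self):
        return {"a": FakeLoader(8, 2), "b": FakeLoader(2, 2)}

    def val_dataloader(self):
        return [FakeLoader(3), FakeLoader(8)]

    def test_dataloader(self):
        return FakeLoader(4, 2)


print(MyDataModule())
```

The dict keys come out with single quotes here because the built-in dict repr is used as-is.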

Alternatives

Open to other ideas



@carmocca carmocca added feature Is an improvement or enhancement good first issue Good for newcomers data handling Generic data-related topic labels Oct 15, 2021
@carmocca carmocca added this to the v1.6 milestone Oct 15, 2021
@carmocca
Contributor Author

cc @kingyiusuen

@Abelarm
Contributor

Abelarm commented Oct 15, 2021

I could take care of this 👍

@kingyiusuen
Contributor

> cc @kingyiusuen

I am happy to let @Abelarm take it :)

@Abelarm
Contributor

Abelarm commented Oct 16, 2021

Hi guys, I am currently stuck between:
[Screenshot 2021-10-16 at 20 01 05]

and

[Screenshot 2021-10-16 at 20 20 09] *

*which is not consistent across the prints.

The problem is the str() of the dict :(

Do you have any ideas? Or is one of the two solutions good enough?

@Abelarm
Contributor

Abelarm commented Oct 16, 2021

> Hi guys, I am currently stuck between:
> [Screenshot 2021-10-16 at 20 01 05]
>
> and
>
> [Screenshot 2021-10-16 at 20 20 09] *
>
> *which is not consistent across the prints.
>
> The problem is the str() of the dict :(
>
> Do you have any ideas? Or is one of the two solutions good enough?

If you really want the dict keys wrapped in "", I can do it, but it won't be the nicest solution.
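
For context on the dict problem: `str()` of a built-in dict or list calls `repr()` on its keys and values, so string keys print with single quotes. A minimal sketch of one workaround follows. `format_loaders` is a hypothetical helper, not part of the PR or of Lightning, and it assumes double-quoted keys are the desired output.

```python
import json


def format_loaders(obj):
    """Render nested dicts/lists with double-quoted string keys (hypothetical helper)."""
    if isinstance(obj, dict):
        items = []
        for key, value in obj.items():
            # json.dumps gives a double-quoted string; other key types keep their repr.
            key_text = (
                json.dumps(key, ensure_ascii=False)
                if isinstance(key, str)
                else repr(key)
            )
            items.append(f"{key_text}: {format_loaders(value)}")
        return "{" + ", ".join(items) + "}"
    if isinstance(obj, list):
        return "[" + ", ".join(format_loaders(v) for v in obj) + "]"
    # Leaves (for example a dataloader) fall back to their own str().
    return str(obj)


loaders = {"a": [1, 2], "b": 3}
print(str(loaders))             # {'a': [1, 2], 'b': 3}
print(format_loaders(loaders))  # {"a": [1, 2], "b": 3}
```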

@carmocca
Contributor Author

Hey @Abelarm! You can open a draft PR so we can check your current implementation and discuss it.

Abelarm added a commit to Abelarm/pytorch-lightning that referenced this issue Oct 16, 2021
@dmarx

dmarx commented Oct 28, 2021

In the spirit of https://docs.python.org/3.4/reference/datamodel.html#object.__repr__:

> If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).

I recommend:

  1. keeping the quotes around dict keys but not dict values
  2. using an = after the name of initialization parameters instead of a :

Following these recommendations, @Abelarm's test expression would become:

MyDataModule(
    train_dataloader={"a": DataLoaderClass(batch_size=8, num_batches=16, num_workers=2), "b": DataLoaderClass(batch_size=2, num_batches=16, num_workers=2)}
    val_dataloader=[DataLoaderClass(batch_size=3, num_batches=14, num_workers=0), DataLoaderClass(batch_size=8, num_batches=4, num_workers=0)]
    test_dataloader=DataLoaderClass(batch_size=4, num_batches=7, num_workers=2)
)
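
To illustrate the `__repr__` guidance quoted above, here is a minimal sketch of a repr that round-trips through `eval`. `LoaderSpec` is a hypothetical class invented for this example. A real DataLoader holds a dataset and cannot generally be rebuilt from its printed form, so this shows only the convention, not a proposed implementation.

```python
class LoaderSpec:
    """Hypothetical value object whose repr is a valid constructor call."""

    def __init__(self, batch_size, num_workers=0):
        self.batch_size = batch_size
        self.num_workers = num_workers

    def __repr__(self):
        # name=value pairs, so the text can be pasted back as Python source.
        return (
            f"{type(self).__name__}(batch_size={self.batch_size!r}, "
            f"num_workers={self.num_workers!r})"
        )

    def __eq__(self, other):
        if not isinstance(other, LoaderSpec):
            return NotImplemented
        return (self.batch_size, self.num_workers) == (
            other.batch_size,
            other.num_workers,
        )


spec = LoaderSpec(batch_size=8, num_workers=2)
text = repr(spec)
print(text)  # LoaderSpec(batch_size=8, num_workers=2)

# Evaluate with an explicit namespace so only LoaderSpec is visible.
rebuilt = eval(text, {"LoaderSpec": LoaderSpec})
print(rebuilt == spec)  # True
```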

@tchaton
Contributor

tchaton commented Nov 1, 2021

Hey @carmocca,

I believe adding support for str() has the same drawback as supporting len().

It might be worth considering a describe method on LightningDataModule instead.

Best,
T.C

@carmocca
Contributor Author

carmocca commented Nov 2, 2021

The main reason for reverting len was its impact on existing truthiness checks. That should not be a problem for str.
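
A small sketch of the truthiness point, using two hypothetical classes: Python falls back to `__len__` when an object defines no `__bool__`, while `__str__` plays no part in truth testing.

```python
class WithLen:
    """Hypothetical class: defines __len__ only."""

    def __len__(self):
        return 0


class WithStr:
    """Hypothetical class: defines __str__ only."""

    def __str__(self):
        return ""


# A zero length makes the object falsy, which can silently change `if obj:` checks.
print(bool(WithLen()))  # False

# __str__ is ignored by truth testing, so the default (truthy) behavior is kept.
print(bool(WithStr()))  # True
```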

@ananthsub do you think the other points you raised in #5965 (comment) justify dropping this feature? We would still have the problem of initialization.

@Abelarm
Contributor

Abelarm commented Nov 18, 2021

> In the spirit of https://docs.python.org/3.4/reference/datamodel.html#object.__repr__:
>
> > If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).
>
> I recommend:
>
>   1. keeping the quotes around dict keys but not dict values
>   2. using an = after the name of initialization parameters instead of a :
>
> Following these recommendations, @Abelarm's test expression would become:
>
> MyDataModule(
>     train_dataloader={"a": DataLoaderClass(batch_size=8, num_batches=16, num_workers=2), "b": DataLoaderClass(batch_size=2, num_batches=16, num_workers=2)}
>     val_dataloader=[DataLoaderClass(batch_size=3, num_batches=14, num_workers=0), DataLoaderClass(batch_size=8, num_batches=4, num_workers=0)]
>     test_dataloader=DataLoaderClass(batch_size=4, num_batches=7, num_workers=2)
> )

In my PR I already go : instead of =, but I am struggling to add "" around the dict keys :(

@carmocca carmocca modified the milestones: 1.6, future Feb 1, 2022
@MrWhatZitToYaa

It seems like this feature is still not implemented. Would it be possible to work on this issue?

@MrWhatZitToYaa MrWhatZitToYaa linked a pull request Sep 25, 2024 that will close this issue