-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Milavision package as a drop-in replacement to torchvision
#15
Conversation
Signed-off-by: Fabrice Normandin <[email protected]>
Signed-off-by: Fabrice Normandin <[email protected]>
Signed-off-by: Fabrice Normandin <[email protected]>
Signed-off-by: Fabrice Normandin <[email protected]>
Signed-off-by: lebrice <[email protected]>
Signed-off-by: lebrice <[email protected]>
Signed-off-by: Fabrice Normandin <[email protected]>
Signed-off-by: lebrice <[email protected]>
Signed-off-by: lebrice <[email protected]>
Signed-off-by: lebrice <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a first pass at a code review. I haven't read the core logic attentively, so I might have more comments next week.
LOCAL = enum.auto() | ||
MILA = enum.auto() | ||
CEDAR = enum.auto() | ||
BELUGA = enum.auto() | ||
GRAHAM = enum.auto() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why you're using an enum instead of something like
if socket.getfqdn().endswith(".server.mila.quebec"):
from . import mila as current
elif socket.getfqdn().endswith(".computecanada.ca"): # ?
from . import cc as current
...
I am going to note one thing here, which is that if this package is successful, a natural evolution of the idea is that other institutions than Mila may be interested in adding support for their own clusters (the enum might become quite large). That would be good, because if external code starts using this library for their own reasons, that is more code that plays nice on our cluster. This is one reason why I think that organizing code to group all code for a particular cluster together is a smart play.
|
||
@property | ||
def fast_data_dir(self) -> Path: | ||
""" Returns the 'fast' directory where datasets are stored for quick read/writtes. """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" Returns the 'fast' directory where datasets are stored for quick read/writtes. """ | |
""" Returns the 'fast' directory where datasets are stored for quick read/writes. """ |
if self is ClusterType.LOCAL: | ||
return Path("data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we should have a default for LOCAL at all.
if typing.TYPE_CHECKING: | ||
# Import these here, so that when writing code, you get hints when doing something like: | ||
# | ||
# from milavision.datasets import caltech | ||
# | ||
# without the type-checker complaining. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, if you're going to paste the whole list of imports here regardless, why not just do it unconditionally and get rid of the logic above it? If we are worried about missing things in newer versions of torchvision, this can be tested for and checked by the CI.
Now this is a bit of a philosophical point, but I think it would be better to explicitly not allow importing datasets from milavision that we do not support. It's expectation management: if I can import something from milavision, I expect it to do the optimal thing on the Mila cluster. If it turns out we don't support it, I think I would prefer to get an exception. Then I know that okay I have to use the default version and I have to be careful.
Do tell if you see use cases where it is important to have full coverage within the same import.
def on_login_node() -> bool: | ||
# IDEA: Detect if we're on a login node somehow. | ||
return socket.getfqdn().endswith(".server.mila.quebec") and "SLURM_TMPDIR" not in os.environ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I like that, it could prevent some accidents!
Alternatively we can check if the fqdn starts with login
, but checking for SLURM_TMPDIR is also a good idea. It might be better to check if it is empty rather than just whether it is in the environment, because if it is the empty string this means we will use the cwd as a tmpdir which is bad.
if on_login_node(): | ||
raise RuntimeError(f"Don't run this on a login node, you fool!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂
_IGNORED = "" | ||
|
||
|
||
def make_dataset(dataset_type: Type[VD], root: str, *agrs, **kwargs) -> VD: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def make_dataset(dataset_type: Type[VD], root: str, *agrs, **kwargs) -> VD: | |
def make_dataset(dataset_type: Type[VD], root: str, *args, **kwargs) -> VD: |
(ditto for uses of the variable below)
Signed-off-by: Fabrice Normandin <[email protected]>
This whole idea was moved to a separate project: https://www.github.com/mila-iqia/mila_datamodules |
Please ignore the other closed PR.
Here's my current take on #13.
Let me know what you think :)
NOTE: I will frequently push to this branch, since I alternate between my local setup and the mila cluster to test stuff out. Sorry in advance for all the notifications. Feel free to just ignore them if you want.
Signed-off-by: Fabrice Normandin [email protected]