Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve task and pin models by supporting resources with units and making a lot optional #36

Merged
merged 9 commits into from
Jan 15, 2025

Conversation

trusch
Copy link
Collaborator

@trusch trusch commented Dec 16, 2024

This tries to improve our model layer, especially for tasks and pins as those two are really end-user facing.

  • I made all optional things optional, so no more env: [] to make it parse
  • The whole metadata field is optional
  • resource units can be raw numbers or strings with units
    • cpus: 1000mcpu works the same as cpus: 1000, but also get stuff like time: 3h
    • time parsing can be improved by using humantime crate, right now its a rather simple implementation
  • Pins have CID and fallback_url optional, but parsing has now a validation step where it checks if any of this two is set
  • I added some tests to illustrate and test the changes

@trusch trusch requested a review from tuommaki December 16, 2024 12:28
@trusch
Copy link
Collaborator Author

trusch commented Dec 16, 2024

Note: This is actually a good place to add more user data validation, so we catch it early and reject it. For example we could check if the image is a properly formatted URI or other stuff like the check if the CID or the fallback url ist set etc.

@trusch trusch requested review from aleasims and miax-gevu December 16, 2024 12:38
src/models.rs Outdated

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
#[serde(untagged)]
pub enum NumberOrString {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially seeing the name of this type I thought "why not use Either?" (that being familiar from Haskell).

Then looking at this implementation I'm thinking that maybe the naming should be a bit more specific like ComputeUnit perhaps? Also: Should this maybe implement FromStr ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it and also added FromStr, so now "1024kB".parse::<ComputeUnit>() works :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now ByteUnit, CoreUnit and TimeUnit

src/models.rs Outdated
Comment on lines 718 to 720
let str = serde_json::from_value::<NumberOrString>(json!("60min")).unwrap();
assert_eq!(str.as_number(), Some(60 * 60));
assert_eq!(str.as_string(), "60min");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially with time (but possibly with storage space as well), I'd maybe test also combinations like "1h 47min 3 sec". Would "3kb 3M" parse ? What about "7min 8gpus" 🙈

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"7min 8gpus" is probably something for the future but for time we can use humantime crate which does exactly what you describe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it then make sense to split different types (CPUs/GPUs, MEM, Time) and possibly have an enum including them, with some brute force parsing logic trying each one (or w/ regex match based heuristics)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When removing the time cases (which I did), the rest of the units isnt that much, so I think its fine like this. We can always extend it, but I feel it does its job well.

@trusch
Copy link
Collaborator Author

trusch commented Jan 8, 2025

@tuommaki Would you have another look at this when you find time?

Copy link
Contributor

@tuommaki tuommaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is fine now. There are some semantic issues (such as mgpu) still, I think, but maybe we can ignore them for a bit as long as we don't document & advertise use of them.

@trusch trusch merged commit 979a6ee into main Jan 15, 2025
1 check passed
@trusch trusch deleted the feature/better-models branch January 15, 2025 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants