-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate manifests #3547
Consolidate manifests #3547
Conversation
8797e16
to
a315d0e
Compare
636928b
to
206c33e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry about the merge conflict, I thought I had merged this before #3551, but I was mistaken.
This looks good to me, feel free to merge after rebasing, and thanks for updating this!
Manifests.toml are a burden to maintain and add a lot of git diff noise. The one benefit of Manifests.toml is that they help with tracking version dependencies. However: - For `docs`, the precise version of dependencies is not important, there is pretty much no version-dependent code being executed there - `perf` and `examples` can be consolidated into a single `.buildkite` environment. This environment only has to have one manifest checked, the one for which buildkite is run (and not one for each version of Julia). - The `perf` environment can be removed all together. It is a more niche environment: if reproducing a specific result is required, developers can use the `.buildkite` Manifest. If not, developers can also consider keeping the relevant tools (mainly JET) in the base environment. This is the same pattern we have in other repos such as ClimaCore.
206c33e
to
624be38
Compare
I went ahead and rebased! |
Includes #3548 |
3529619
to
3932f19
Compare
5bb811c
to
e97cee6
Compare
d2646d2
to
499f419
Compare
@@ -1,16 +1,17 @@ | |||
moist: "equil" | |||
apply_limiter: true | |||
z_elem: 25 | |||
z_elem: 11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's because this job failed: https://buildkite.com/clima/climaatmos-ci/builds/22496#0194af81-d116-4343-92f6-9b753e5763aa, which is likely related to the climatimestepper update.
You changed the wrong config file though I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. I think it might be better to just increase the mem request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using low resolution is not really meaningful for performance monitoring jobs, so I think it'd be better to remove it instead of lowering the resolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine with me. I prefer to make it soft fail though so we will remember to come back to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. I think it might be better to just increase the mem request
The memory is GPU memory, so we would need to request a different type of GPU, which would increase the queue time significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to a soft fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I think there is something wrong, though. The fieldvector is being allocated here, and it's definitely not that large, so I don't know why we're OOMing. Maybe the GPU is not freeing memory as it should be.
499f419
to
062d71c
Compare
062d71c
to
5048573
Compare
It passed everything in the merge queue, but somehow the format check didn't start. |
I have an idea of what it might be. Let me start by merging this manually |
@Sbozzolo why is there a Manifest in the .buildkite/ directory but no Project? |
It was probably deleted by accident. We do need one becuase:
I'll open a PR. |
Manifests.toml are a burden to maintain and add a lot of git diff noise. The one benefit of Manifests.toml is that they help with tracking version dependencies.
However:
docs
, the precise version of dependencies is not important, there is pretty much no version-dependent code being executed thereperf
andexamples
can be consolidated into a single.buildkite
environment. This environment only has to have one manifest checked, the one for which buildkite is run (and not one for each version of Julia).perf
environment can be removed all together. It is a more niche environment: if reproducing a specific result is required, developers can use the.buildkite
Manifest. If not, developers can also consider keeping the relevant tools (mainly JET) in the base environment.This is the same pattern we have in other repos such as ClimaCore.
(Builds off #3546)
Closes #3548