Consolidate Manifests
Manifest.toml files are a burden to maintain and add a lot of git diff
noise. Their one benefit is that they help track the exact versions of
dependencies.

However:
- For `docs`, the precise version of dependencies is not important:
there is pretty much no version-dependent code being executed there.
- `perf` and `examples` can be consolidated into a single `.buildkite`
environment. This environment only needs one manifest checked in, the
one for the Julia version that Buildkite runs (not one for each version
of Julia).
- The `perf` environment can be removed altogether. It is a more niche
environment: if reproducing a specific result is required, developers
can use the `.buildkite` Manifest. If not, developers can also consider
keeping the relevant tools (mainly JET) in the base environment.

This is the same pattern we have in other repos such as ClimaCore.
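
The mechanical part of this consolidation, as seen in the pipeline diff, is replacing `--project=examples` and `--project=perf` with `--project=.buildkite`. A minimal sketch of that substitution follows. It is an illustration only, not necessarily how the commit was produced, and the function name `consolidate_projects` is hypothetical:

```shell
# Hypothetical helper: point every Julia invocation in a pipeline file
# at the consolidated .buildkite environment.
consolidate_projects() {
    # $1: path to a pipeline file; the original is kept as "$1.bak".
    # Only exact --project=examples / --project=perf flags are rewritten.
    sed -i.bak -E \
        's/--project=(examples|perf)([[:space:]]|$)/--project=.buildkite\2/g' "$1"
}

# Demo on a throwaway file containing the two kinds of invocations.
demo="$(mktemp)"
printf '%s\n' \
    "julia --project=examples examples/hybrid/driver.jl" \
    "julia --project=perf perf/benchmark.jl" > "$demo"
consolidate_projects "$demo"
cat "$demo"
```

Script paths such as `examples/hybrid/driver.jl` and `perf/benchmark.jl` are left untouched; only the `--project` flag changes.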
Sbozzolo committed Jan 27, 2025
1 parent 45855b0 commit a315d0e
Showing 13 changed files with 252 additions and 13,939 deletions.
File renamed without changes.
50 changes: 22 additions & 28 deletions .buildkite/gpu_pipeline/pipeline.yml
@@ -20,19 +20,13 @@ steps:
key: "init_gpu_env"
command:
- echo "--- Instantiate examples"
- julia --project=examples -e 'using Pkg; Pkg.instantiate(;verbose=true)'
- julia --project=examples -e 'using Pkg; Pkg.precompile()'
- julia --project=examples -e 'using CUDA; CUDA.precompile_runtime()'
- julia --project=examples -e 'using Pkg; Pkg.status()'

- echo "--- Instantiate perf"
- julia --project=perf -e 'using Pkg; Pkg.instantiate(;verbose=true)'
- julia --project=perf -e 'using Pkg; Pkg.precompile()'
- julia --project=perf -e 'using CUDA; CUDA.precompile_runtime()'
- julia --project=perf -e 'using Pkg; Pkg.status()'
- julia --project=.buildkite -e 'using Pkg; Pkg.instantiate(;verbose=true)'
- julia --project=.buildkite -e 'using Pkg; Pkg.precompile()'
- julia --project=.buildkite -e 'using CUDA; CUDA.precompile_runtime()'
- julia --project=.buildkite -e 'using Pkg; Pkg.status()'

- echo "--- Download artifacts"
- julia --project=examples artifacts/download_artifacts.jl
- julia --project=.buildkite artifacts/download_artifacts.jl

agents:
slurm_gpus: 1
@@ -52,7 +46,7 @@ steps:
- mkdir -p target_gpu_implicit_baroclinic_wave
- >
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=target_gpu_implicit_baroclinic_wave/output_active/report
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}target_gpu_implicit_baroclinic_wave.yml
--job_id target_gpu_implicit_baroclinic_wave
artifact_paths: "target_gpu_implicit_baroclinic_wave/output_active/*"
@@ -69,7 +63,7 @@ steps:
- mkdir -p gpu_hs_rhoe_equil_55km_nz63_0M
- >
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_hs_rhoe_equil_55km_nz63_0M/output_active/report
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_hs_rhoe_equil_0M.yml
--job_id gpu_hs_rhoe_equil_55km_nz63_0M
artifact_paths: "gpu_hs_rhoe_equil_55km_nz63_0M/output_active/*"
@@ -87,7 +81,7 @@ steps:
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_hs_rhoe_equil_55km_nz63_0M_4process/output_active/report-%q{PMI_RANK}
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_hs_rhoe_equil_0M.yml
--job_id gpu_hs_rhoe_equil_55km_nz63_0M_4process
artifact_paths: "gpu_hs_rhoe_equil_55km_nz63_0M_4process/output_active/*"
@@ -107,7 +101,7 @@ steps:
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=osrt,nvtx,cuda,mpi,ucx --output=target_gpu_implicit_baroclinic_wave_4process/output_active/report-%q{PMI_RANK}
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}target_gpu_implicit_baroclinic_wave.yml
--job_id target_gpu_implicit_baroclinic_wave_4process
artifact_paths: "target_gpu_implicit_baroclinic_wave_4process/output_active/*"
@@ -128,7 +122,7 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_diag_1process
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_diag_1process/output_active/report julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_diag_1process/output_active/report julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_diag_1process.yml
--job_id gpu_aquaplanet_dyamond_diag_1process
artifact_paths: "gpu_aquaplanet_dyamond_diag_1process/output_active/*"
@@ -149,7 +143,7 @@ steps:
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_ss_1process/output_active/report
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ss.yml
--job_id gpu_aquaplanet_dyamond_ss_1process
artifact_paths: "gpu_aquaplanet_dyamond_ss_1process/output_active/*"
@@ -169,7 +163,7 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ss_2process
- >
srun --cpu-bind=threads --cpus-per-task=4
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ss.yml
--job_id gpu_aquaplanet_dyamond_ss_2process
artifact_paths: "gpu_aquaplanet_dyamond_ss_2process/output_active/*"
@@ -189,7 +183,7 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ss_4process
- >
srun --cpu-bind=threads --cpus-per-task=4
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ss.yml
--job_id gpu_aquaplanet_dyamond_ss_4process
artifact_paths: "gpu_aquaplanet_dyamond_ss_4process/output_active/*"
@@ -210,7 +204,7 @@ steps:
command:
- mkdir -p gpu_aquaplanet_dyamond_ss
- >
julia --color=yes --project=examples post_processing/plot_gpu_strong_scaling.jl gpu_aquaplanet_dyamond_ss
julia --color=yes --project=.buildkite post_processing/plot_gpu_strong_scaling.jl gpu_aquaplanet_dyamond_ss
artifact_paths: "gpu_aquaplanet_dyamond_ss/*"
env:
CLIMACOMMS_CONTEXT: "MPI"
@@ -227,7 +221,7 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ws_1process
- >
srun --cpu-bind=threads --cpus-per-task=4
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ws_1process.yml
--job_id gpu_aquaplanet_dyamond_ws_1process
artifact_paths: "gpu_aquaplanet_dyamond_ws_1process/output_active/*"
@@ -247,7 +241,7 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ws_2process
- >
srun --cpu-bind=threads --cpus-per-task=4
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ws_2process.yml
--job_id gpu_aquaplanet_dyamond_ws_2process
artifact_paths: "gpu_aquaplanet_dyamond_ws_2process/output_active/*"
@@ -267,7 +261,7 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ws_4process
- >
srun --cpu-bind=threads --cpus-per-task=4
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ws_4process.yml
--job_id gpu_aquaplanet_dyamond_ws_4process
artifact_paths: "gpu_aquaplanet_dyamond_ws_4process/output_active/*"
@@ -288,7 +282,7 @@ steps:
command:
- mkdir -p gpu_aquaplanet_dyamond_ws
- >
julia --color=yes --project=examples post_processing/plot_gpu_weak_scaling.jl gpu_aquaplanet_dyamond_ws
julia --color=yes --project=.buildkite post_processing/plot_gpu_weak_scaling.jl gpu_aquaplanet_dyamond_ws
artifact_paths: "gpu_aquaplanet_dyamond_ws/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -308,7 +302,7 @@ steps:
- mkdir -p gpu_aquaplanet_diagedmf
- >
nsys profile --delay 200 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_diagedmf/output_active/report
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${MODEL_CONFIG_PATH}aquaplanet_diagedmf.yml
--job_id gpu_aquaplanet_diagedmf
artifact_paths: "gpu_aquaplanet_diagedmf/output_active/*"
@@ -324,7 +318,7 @@ steps:

- label: "gpu_aquaplanet_diagedmf_benchmark"
command: >
julia --color=yes --project=perf perf/benchmark.jl
julia --color=yes --project=.buildkite perf/benchmark.jl
--config_file ${MODEL_CONFIG_PATH}aquaplanet_diagedmf.yml
--job_id gpu_aquaplanet_diagedmf_benchmark
artifact_paths: "gpu_aquaplanet_diagedmf_benchmark/output_active/*"
@@ -342,7 +336,7 @@ steps:
- mkdir -p gpu_aquaplanet_progedmf
- >
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_progedmf/output_active/report
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
julia --threads=3 --color=yes --project=.buildkite examples/hybrid/driver.jl
--config_file ${MODEL_CONFIG_PATH}aquaplanet_progedmf.yml
--job_id gpu_aquaplanet_progedmf
artifact_paths: "gpu_aquaplanet_progedmf/output_active/*"
@@ -358,7 +352,7 @@ steps:

- label: "gpu_aquaplanet_progedmf_benchmark"
command: >
julia --color=yes --project=perf perf/benchmark.jl
julia --color=yes --project=.buildkite perf/benchmark.jl
--config_file ${MODEL_CONFIG_PATH}aquaplanet_progedmf.yml
--job_id gpu_aquaplanet_progedmf_benchmark
artifact_paths: "gpu_aquaplanet_progedmf_benchmark/output_active/*"