Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[experimental] Adding a CudaExperiment, and variables to Base and CudaExperiment #400

Draft
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

pearce8
Copy link
Collaborator

@pearce8 pearce8 commented Oct 14, 2024

  • Variables for Application and Spack section to go into Experiment
  • CudaExperiment to hold cuda-specific versions for the above
  • Application-specific experiment.py would then be class Amg2023(Experiment,CudaExperiment):
  • Can/should we define experiment.py for hypre, and use it in experiment.py for amg2023?

Copy link
Collaborator

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall my comments fall into two categories

  • Some discuss how you could potentially make this inheritance-based approach work
  • Some highlight the potential issues with an inheritance-based approach and discuss an alternative

2. **Experiment instances**. Once instantiated, an experiment is
essentially a collection of files defining an experiment in a
Ramble workspace.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring is copied from Experiment and doesn't explain anything new or specific to this class.

variants: Dict[
"benchpark.spec.Spec",
Dict[str, benchpark.variant.Variant],
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not have to instantiate this here. Why does this inherit from ExperimentSystemBase vs. Experiment? If this is related to MRO concerns, see my later comment about using composition.

system_specs = {}
system_specs["cuda_version"] = "{default_cuda_version}"
system_specs["cuda_arch"] = "{cuda_arch}"
system_specs["blas"] = "cublas-cuda" #TODO: can we define this in Lapack/Blas CudaExperiment?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience this differs from system to system, and it's best handled by saying here just that you need blas; the system can then register whatever blas external is appropriate (see for example in Sierra definition: https://github.com/LLNL/benchpark/blob/develop/var/sys_repo/systems/sierra/externals/cuda/00-version-10-1-243-packages.yaml).


raise NotImplementedError(
"Each experiment must implement compute_spack_section"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With an inheritance scheme, this would have to return something, e.g.

        return {
            "packages": {k: v for k, v in package_specs.items() if v},
            "environments": {app_name: {"packages": list(package_specs.keys())}},
        }

(copied from Amg2023) at which point, a subclass would want to do something like:

class FooExperiment(CudaExperiment):
  def compute_spack_section(self):
    config = super().compute_spack_section()
    # now add all the things specific to Foo

Using inheritance is somewhat inflexible, because everything you want to mix together would need to be part of a single inheritance chain (which is why I suggested composition in #356 (comment), for example).

@@ -70,12 +70,43 @@ def compute_modifiers_section(self):

def compute_applications_section(self):
# TODO: is there some reasonable default?
variables = {}
variables["n_ranks"] = num_procs # TODO: num_procs will be defined in the child...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is so, this will need to be self.num_procs, and that will be a new required method for all subclasses (which can be checked by benchpark audit, FWIW).

"pkg_spec": f"{app_name}@{app_version} +mpi",
"compiler": system_specs["compiler"],
}

raise NotImplementedError(
"Each experiment must implement compute_spack_section"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this does anything, it can't raise NotImplementedError. FWIW benchpark audit currently makes sure that the lowest-level definition (e.g. what lives in exp_repo) defines this method.

package_specs = {}
package_specs[system_specs["mpi"]] = (
{}
) # empty package_specs value implies external package
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good candidate for a method in Experiment like

def needs_external(pkgs_dict, system_specs, pkg_name):
  pkgs_dict[system_specs[pkg_name]] = {}

(mainly so this comment doesn't have to be repeated).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants