
Experimental command line interface UX #6135

Draft: wants to merge 18 commits into branch-24.12
Conversation

@dantegd (Member Author) commented Nov 13, 2024

No description provided.

@github-actions bot added the "Cython / Python" label Nov 13, 2024
@viclafargue (Contributor) left a comment:

Good job! Just have a few comments. I focused my review on the estimator proxy and the interception function.

@@ -471,6 +477,45 @@ class Base(TagsMixin,
func = nvtx_annotate(message=msg, domain="cuml_python")(func)
setattr(self, func_name, func)

@classmethod
def _hyperparam_translator(cls, **kwargs):
Contributor:

I think that the function could be simplified. Also an edge case, but what happens when the result of the translation is supposed to be an "accept" or "dispatch" string?

@betatim (Member), Nov 14, 2024:

I also think we could improve this a bit.

Before looking at how it was done in this PR my expectation was to find something like this:

class Base:
  def _translate_hyperparameters(self, foo="some value", **hyper_parameters):
    if foo != "some value":
      hyper_parameters["foo"] = "some value"
    return hyper_parameters

class A(Base):
  def _translate_hyperparameters(self, bar=True, **hyper_parameters):
    hyper_parameters = super()._translate_hyperparameters(**hyper_parameters)
    if not bar:
      raise NotImplementedError(f"Can't dispatch with '{bar=}', has to be True.")
    return hyper_parameters

So that we can accumulate translations from Base upwards, save on having the extra layer of the translation dictionary

(I need to think a bit more/try out the code, so bugs ahead but right now I'm channeling my best inner Raymond Hettinger: "there must be a better way" :D)

@dantegd (Member Author):

@betatim the reason to avoid a method per class like that is to avoid a ton of extra code per algorithm; the dictionary is actually based on scikit-learn's parameter constraints, just simplified for now.

@viclafargue I'm not sure I understand the edge case: what happens when the result of the translation is supposed to be an "accept" or "dispatch" string?

Contributor:

Let's say there's a hyperparameter called nan_values_handling. It can take the value "allow" in cuML, but its equivalent in sklearn is "accept". Is there a way to write a dictionary that does the translation?

Member:

Can we just use an Enum for accept and dispatch? To me it feels brittle if there's some estimator in the future that uses "accept" string as a valid hyperparam value.
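For illustration, sentinels built on an Enum can never collide with a real string hyperparameter value. This is a hypothetical sketch; cuML does not define these names:

```python
from enum import Enum, auto

class HyperparamAction(Enum):
    """Sentinels that cannot collide with real string hyperparameter values."""
    ACCEPT = auto()    # value works on the GPU as-is
    DISPATCH = auto()  # fall back to the CPU implementation

# The translation table can then mix real string targets and sentinels safely:
translations = {
    "solver": {
        "lbfgs": "qn",
        "liblinear": HyperparamAction.DISPATCH,
    },
}

print(translations["solver"]["liblinear"] is HyperparamAction.DISPATCH)  # True
```

Identity checks (`is`) on Enum members are unambiguous even if some future estimator legitimately accepts the string "accept".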

Member:

I'm not sure we have to worry about the case where the value is "accept" or "dispatch". The way I understand the dictionary like

_hyperparam_interop_translator = {
        "solver": {
            "lbfgs": "qn",
            "liblinear": "qn",
            "newton-cg": "qn",
            "newton-cholesky": "qn",
            "sag": "qn",
            "saga": "qn"
        },
    }

is that it lists those hyper-parameters that need translating, and for each it lists the values and their translations. For example solver="lbfgs" needs translating to solver="qn", but say foobar=42 won't need translating because foobar isn't listed. Similarly solver="dazzle" doesn't need translating because it isn't listed. This makes me think we don't really need an entry with a value of "accept"; we could just have no entry ("accept" is the default assumption).

That deals with the "accept" magic value. Then there is the case of "dispatch", which is used for parameter values that can't be translated; it means "use scikit-learn". I think we should replace it with something like NotImplemented or some other exception or similar. This would deal with the case where a cuml value for a parameter should be "dispatch": here we aren't able to tell if we should use scikit-learn or use cuml with this value. Hence let's use something like NotImplemented.

The parameter values used in scikit-learn don't matter because they are keys in the dictionaries.
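The lookup semantics described here can be sketched as a small helper (hypothetical; not code from this PR):

```python
# Hypothetical table in the shape discussed above: a missing entry means
# "accept"; NotImplemented means "can't translate, use scikit-learn".
_hyperparam_interop_translator = {
    "solver": {
        "lbfgs": "qn",
        "sag": NotImplemented,
    },
}

def translate(name, value, table=_hyperparam_interop_translator):
    """Return the GPU equivalent, the value unchanged (implicit "accept"),
    or NotImplemented when only scikit-learn can handle it."""
    return table.get(name, {}).get(value, value)

print(translate("solver", "lbfgs"))   # qn
print(translate("solver", "dazzle"))  # dazzle (not listed: accepted as-is)
print(translate("foobar", 42))        # 42 (parameter not listed at all)
```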

Member:

Having slept on my comment I've changed my mind. I still think having a method on each class that modifies the parameters as it sees fit could be nice. Not sure if it will create more typing/work than using a dict ¯\_(ツ)_/¯.

I also pondered the _hyperparam_translator function, and I think this is how I'd write it. It doesn't use my suggestion from above (NotImplemented), but it could. The main changes: it isn't a classmethod anymore (I couldn't work out why it was one), it merges the base class's translations with those of the derived class (I assume those are the only two interesting ones; we don't need to merge them from the whole inheritance tree), and I removed the if cases that were for "accept" (I think we don't do anything in those cases other than 👍).

    def _hyperparam_translator(self, **kwargs):
        """
        This method is meant to do checks and translations of hyperparameters
        at estimator creation time.
        Each child estimator can override the method, returning modified
        **kwargs with equivalent options.
        """
        gpuaccel = True
        # Copy it so we can modify it
        translations = dict(super()._hyperparam_interop_translator)
        # Allow the derived class to overwrite the base class
        translations.update(self._hyperparam_interop_translator)
        for parameter_name, value in kwargs.items():
            # maybe clean up using: translations.get(parameter_name, {}).get(value, None)?
            if parameter_name in translations:
                if value in translations[parameter_name]:
                    if translations[parameter_name][value] == "dispatch":
                        gpuaccel = False
                    else:
                        kwargs[parameter_name] = translations[parameter_name][value]

        return kwargs, gpuaccel

One more thought on "dispatch": if we don't replace it with NotImplemented or similar, can we use use_cpu or something? I can't keep it straight in my head what "dispatch" means in the various libraries (most mean "use a GPU" when they talk about dispatching; in cuml it means "use a CPU"). Maybe something explicit like "use_cpu" makes it easier to reason about what is happening (though my preference is still NotImplemented or similar).


# GPU case
if device_type == DeviceType.device:
if device_type == DeviceType.device or func_name not in ['fit', 'fit_transform', 'fit_predict']:
Contributor:

Could you explain the change?

@@ -234,7 +234,7 @@ class UMAP(UniversalBase,
are returned when transform is called on the same data upon
which the model was trained. This enables consistent
behavior between calling ``model.fit_transform(X)`` and
calling ``model.fit(X).transform(X)``. Not that the CPU-based
calling ``model.fit(X).transform(X)``. Note that the CPU-based
Contributor:

Thanks for fixing this one ^^

)
super().__init__()
self.import_cpu_model()
self._cpu_model = self._cpu_model_class()
Contributor:

It would be preferable to instantiate with hyperparameters if available. Aren't they stored as attributes? Why not call build_cpu_model? Maybe we should add _full_kwargs to the state dict before serialization?

Comment on lines +121 to +122
self.output_type = "numpy"
self.output_mem_type = MemoryType.host
Contributor:

Could you explain?

Comment on lines +30 to +31
# currently we just use this dictionary for debugging purposes
patched_classes = {}
Contributor:

Do we plan to keep this for merge?

Contributor:

If so, we should make it threadsafe.

Comment on lines +96 to +99
with disable_module_accelerator():
filename = self.__class__.__name__ + "_sklearn"
with open(filename, "wb") as f:
pickle.dump(self._cpu_model_class, f)
Contributor:

Could you explain this part?

Contributor:

Also, can pickle.dumps be used instead?
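For comparison, `pickle.dumps`/`pickle.loads` keep the payload in memory without touching the filesystem (a generic sketch, unrelated to cuML internals):

```python
import pickle

class TinyModel:
    """Stand-in for an estimator class; any picklable object works."""
    def __init__(self, n_clusters):
        self.n_clusters = n_clusters

payload = pickle.dumps(TinyModel(3))   # bytes object, no file written
restored = pickle.loads(payload)
print(restored.n_clusters)  # 3
```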

Member:

I'm also interested in understanding why this is here/what it does. If we really have to write to a file it should probably be a tempfile.

But ideally we don't have to do that?

The whole story around pickling warrants some developer documentation, I think.
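If writing to disk really is necessary, `tempfile` avoids dropping files into the working directory (a generic sketch):

```python
import os
import pickle
import tempfile

data = {"estimator": "KMeans", "backend": "sklearn"}

# delete=False so the file can be reopened by name after the context exits
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as f:
    pickle.dump(data, f)
    path = f.name

with open(path, "rb") as f:
    print(pickle.load(f)["estimator"])  # KMeans

os.unlink(path)  # clean up the temporary file
```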

from ..estimator_proxy import intercept


UMAP = intercept(
Contributor:

HDBSCAN

@wphicks (Contributor) left a comment:

Finished first half of the review but would like to continue going over the rest more carefully from here.


from .magics import load_ipython_extension

# from .profiler import Profiler
Contributor:

Let's remove the comment.

__all__ = ["load_ipython_extension", "install"]


LOADED = False
Contributor:

Is this threadsafe? My initial instinct is that this requires protection by a lock, but maybe there is some reason why this wouldn't be an issue?
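A common pattern for guarding a module-level flag like this (a sketch; the real patching logic is elided):

```python
import threading

LOADED = False
_LOAD_LOCK = threading.Lock()

def install():
    global LOADED
    with _LOAD_LOCK:
        if LOADED:       # repeated calls become cheap no-ops
            return
        # ... perform the actual patching here ...
        LOADED = True

install()
install()  # safe to call repeatedly, or from multiple threads
print(LOADED)  # True
```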

original_class_name="DBSCAN",
)

# HDBSCAN = intercept(
Contributor:

Could we add a comment explaining why the commented-out algorithms are disabled here?

Member:

I'd be in favour of the proposal from your other comment: remove commented out code and code that does nothing. We can easily add it back (from the git commit history) if we want to


self._cpu_model_class = (
original_class_a # Store a reference to the original class
)
# print("HYPPPPP")
Contributor:

Suggested change: remove the `# print("HYPPPPP")` line.

)

def __setstate__(self, state):
print(f"state: {state}")
Contributor:

Suggested change: remove the `print(f"state: {state}")` line.

# limitations under the License.
#

from __future__ import annotations
Contributor:

Let's remove any unused code from this file.

frame = sys._getframe()
# We cannot possibly be at the top level.
assert frame.f_back
calling_module = pathlib.PurePath(frame.f_back.f_code.co_filename)
Contributor:

Have we profiled to understand the performance implications of this mechanism? Inspecting the frame seems like something we should do only if we really have to. In the context of cuML, we're already tracking whether or not we're internal to the cuML API, so do we need to use this?

f".._wrappers.{mode.slow_lib}", __name__
)
try:
(self,) = (
Contributor:

Why the tuple unpacking here?

return self


def disable_module_accelerator() -> contextlib.ExitStack:
Contributor:

As a follow-on, we should be able to make this much faster using our global settings object. Not critical for the initial merge though.

@click.command()
@click.option("-m", "module", required=False, help="Module to run")
@click.option(
"--profile",
Member:

Can we add a warning here that this is an unused parameter at the moment?

Member:

How about removing parameters that at the moment do nothing? If we want to add them (with functionality) it is easy enough to do. And it keeps things tidy, both for the reader of the help message and the reader of the code

help="Perform per-function profiling of this script.",
)
@click.option(
"--line-profile",
Member:

Same as above

Comment on lines +21 to +34
# from .profiler import Profiler, lines_with_profiling

# @magics_class
# class CumlAccelMagic(Magics):
# @cell_magic("cuml.accelerator.profile")
# def profile(self, _, cell):
# with Profiler() as profiler:
# get_ipython().run_cell(cell) # noqa: F821
# profiler.print_per_function_stats()

# @cell_magic("cuml.accelerator.line_profile")
# def line_profile(self, _, cell):
# new_cell = lines_with_profiling(cell.split("\n"))
# get_ipython().run_cell(new_cell) # noqa: F821
Member:

Remove?


@hcho3 (Contributor) left a comment:

No major blockers

Comment on lines +511 to +513

# else:
# gpuaccel = False
Contributor:

Why is this line commented out?

)


def pytest_load_initial_conftests(early_config, parser, args):
Contributor:

So the no-code change magic kicks in when running the pytest suite? Very cool.

By the way, does it affect the other pytests outside cuml.experimental.accel? Many of our existing tests assert that a cuML algorithm matches the output of its sklearn counterpart.

@betatim (Member) commented Nov 15, 2024

I ran the following small snippet to see things in action, but I'm now puzzled about whether or not cuml was used. Is there an easy way to tell (assume I'm a simple minded user who isn't going to dig into the cuml codebase)?

import cuml.experimental.accel
cuml.experimental.accel.install()

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans


X, y = make_blobs()
km = KMeans()

km.fit(X, y)
print(f"{km.cluster_centers_=}")
print(km.score(X, y))

This outputs the following:

Installing cuML Accelerator...
[I] [08:33:03.570100] Non Estimator Function Dispatching disabled...
[I] [08:33:03.605120] Non Estimator Function Dispatching disabled...
[I] [08:33:03.607562] Non Estimator Function Dispatching disabled...
km.cluster_centers_=array([[ 8.51813728,  0.89449653],
       [ 5.36304509, -9.09408513],
       [-1.06137904,  6.52824416],
       [ 7.0920223 , -1.11348216],
       [ 6.98095313, -8.23207799],
       [ 6.79229768, -9.76694763],
       [ 0.20774067,  7.58842924],
       [ 6.71965882,  1.64106257]])
-85.08620849985817

I was expecting to see either a log message saying "This was run on the GPU!" (or something similarly positive and simple) or as an alternative something like what I proposed in scikit-image where we issue a DispatchNotification (via the warning system) that lets people know code was run differently from how it would have been without the dispatching enabled.

The second thing I thought might tell me if it was dispatched was inspecting a fitted attribute, though I guess cuml array works hard to make that hard :-/
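A dispatch notification of the kind described could be built on the stdlib warnings machinery. This is a hypothetical sketch; cuML defines none of these names:

```python
import warnings

class DispatchNotification(UserWarning):
    """Signals that a call was executed by an accelerated backend."""

def notify_dispatch(qualname, backend="cuml"):
    warnings.warn(
        f"{qualname} was dispatched to {backend} (GPU).",
        DispatchNotification,
        stacklevel=2,
    )

# Users could silence it with warnings.filterwarnings("ignore", category=...)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    notify_dispatch("KMeans.fit")

print(caught[0].category.__name__)  # DispatchNotification
```

Because it goes through the warnings system, it is visible by default but filterable, unlike a bare log line.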

original_class_a # Store a reference to the original class
)
# print("HYPPPPP")
kwargs, self._gpuaccel = self._hyperparam_translator(**kwargs)
Member:

We need to handle things like KMeans(8) as well. So not just translating keyword arguments but also the (few?) occasions where positional arguments are allowed for the scikit-learn estimator.

Right now kwargs = {'args': 8} when you instantiate KMeans(8) like this. That seems definitely not what we want :D

Member:

I think we can use the result of inspect.signature(self._cpu_model_class).bind(*args, **kwargs).arguments to get a dictionary that contains everything the user passed.

To get all arguments, including defaults that the user didn't pass:

signature = inspect.signature(self._cpu_model_class).bind(*args, **kwargs)
signature.apply_defaults()
print(signature.arguments)

I think this is what we need to pass to the translator
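A self-contained illustration of that approach, using a stand-in class instead of the real KMeans:

```python
import inspect

class FakeKMeans:
    """Stand-in with a KMeans-like constructor signature."""
    def __init__(self, n_clusters=8, init="k-means++", max_iter=300):
        pass

# Positional and keyword arguments both land under their parameter names,
# so FakeKMeans(4) no longer shows up as {'args': 4}.
bound = inspect.signature(FakeKMeans).bind(4, max_iter=100)
bound.apply_defaults()
print(dict(bound.arguments))
# {'n_clusters': 4, 'init': 'k-means++', 'max_iter': 100}
```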

@dantegd (Member Author):

I had never seen this KMeans(8); what is the expected behavior here? I thought positional arguments other than X and y were not allowed?

def __repr__(cls):
return repr(original_class_a)

class ProxyEstimator(class_b, metaclass=ProxyEstimatorMeta):
Member:

What was the thinking around making this inherit from class_b (the cuml class right?)? Naively I'd have made this a class that doesn't inherit from either class A or class B, but has an instance of each stored as an attribute and then proxies to the appropriate one.

The next thing I'd have tried is inheriting from the sklearn class, in the hopes of making isinstance checks "just work". Though I think I'd probably have gone back to using attributes, because the asymmetry of proxying A and B would have confused me.

Looking at this class it mostly does hyper-parameter translation and then deals with serialisation. I think this is because the cuml class that we inherit from already has all the methods we need and they get called directly. Which might be the answer to my original question?
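The attribute-based alternative described here might look like this minimal sketch (hypothetical; not the PR's design):

```python
class Proxy:
    """Holds one instance per backend and forwards attribute access."""

    def __init__(self, cpu_model, gpu_model, use_gpu=True):
        self._cpu = cpu_model
        self._gpu = gpu_model
        self._use_gpu = use_gpu

    def __getattr__(self, name):
        # Only called for attributes not found on the Proxy itself,
        # so _cpu/_gpu/_use_gpu resolve normally.
        backend = self._gpu if self._use_gpu else self._cpu
        return getattr(backend, name)

class CPUModel:
    def fit(self, X=None):
        return "cpu"

class GPUModel:
    def fit(self, X=None):
        return "gpu"

p = Proxy(CPUModel(), GPUModel(), use_gpu=False)
print(p.fit())  # cpu
```

The trade-off versus inheriting from the cuml class is that isinstance checks against either backend class fail unless you also override `__class__` or register virtual subclasses.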

@dantegd (Member Author), Nov 15, 2024:

The machinery of dispatching lives in the cuML class; otherwise we would need to recreate dispatching here when we already have it in cuML, as you mention at the end. So essentially the ProxyEstimator is just an extension of the existing cuML estimators that have functionality for cuml-cpu, so there could be an argument in the future for upstreaming this functionality to Base as we refactor our Python codebase.



def reconstruct_proxy(original_module, new_module, class_name_a, args, kwargs):
"Function needed to pickle since ProxyEstimator is"
Member:

Needs proper triple quotes, and the docstring ends mid-sentence.

@betatim (Member) commented Nov 15, 2024

In general I think we can fix/change most things here after people start trying it.

These are things I'd fix before:

  • remove commented out code and print statements
  • fix docstring formatting, triple quotes, grammar, etc. (IMHO sloppy docstrings are like a messy workshop: it doesn't mean the mechanic is less skilled, but the first impression suffers)
  • deal with things like KMeans(8) so that we don't skip parameters by accident. Also why does it show up as args?
  • Add a log message or dispatch notification (via the warnings system) to let people know "Congratulations, your code is running on a GPU! Time to celebrate!" Given this is all about making people use GPUs, I think we should make it 120% clear to users that they just got accelerated
  • clean up the existing log messages. Either by making them more detailed or removing them for now
