[@card] Realtime cards (1/N) #1550

valayDave · 2023-09-26T22:20:29Z

Realtime Cards

PR Stack

Interface changes to current.card and machinery needed to render the cards / data-updates from the user code ([@card] Realtime cards (1/N) #1550) [Review-able]
Metaflow Card UI Code changes to support real-time updates of components ([@card] Realtime cards (2/N) #1551) [Review-able]
Make internal MetaflowCard and some MetaflowCardComponents real-time updatable ([@card] Realtime cards (3/N) #1552) [Review-able]
Introduce a card server command to launch a local card server that will allow viewing cards in "realtime" ([@card] Realtime cards (4/N) #1553) [WIP]

Metaflow UI Changes PRs

Expose API in the Metaflow UI Service [[realtime-cards] backend API Methods metaflow-service#402]
Make Client side changes in the Netflix/metaflow-ui

Implementation Memo

Key Changes in the PR

Introduce the current.card.refresh interface for rendering / pushing data updates
Introduce the current.card.components interface to update components
Introduce the MetaflowCardComponent.update interface
Introduce the MetaflowCard.render_runtime and MetaflowCard.refresh interfaces to render cards during runtime and handle the logic for data updates
Lift card creation logic outside the card decorator so it can be called independently even if multiple decorators are given to the same card. We needed to change this because the current.card object is a singleton that is only instantiated once per task and is responsible for catching runtime rendering calls. Since all the card rendering logic was inside the instance of a decorator, the individual instance of a decorator was bound to only a single card. This PR lifts that logic into a separate abstraction called CardCreator

valayDave · 2023-09-26T22:49:30Z

metaflow/plugins/cards/card_creator.py

Abstraction introduced to ensure that card creation logic can be safely handled by the CardComponentCollector class

valayDave · 2023-09-26T22:50:26Z

metaflow/plugins/cards/card_datastore.py

@@ -88,15 +89,15 @@ def __init__(self, flow_datastore, pathspec=None):
        self._temp_card_save_path = self._get_write_path(base_pth=TEMP_DIR_NAME)

    @classmethod
-    def get_card_location(cls, base_path, card_name, card_html, card_id=None):
-        chash = sha1(bytes(card_html, "utf-8")).hexdigest()


Changing from using hash to uuids since hashes will change for cards rendered during runtime.

valayDave · 2023-09-26T22:51:54Z

metaflow/plugins/cards/card_decorator.py

+            self.attributes,
+            self.card_options,
+            editable=self._is_editable,
+            customize=customize,
+            runtime_card=self._is_runtime_card,


Additional arguments are provided to _add_card so that they can be piped down when rendering a new card using the CardCreator class.

valayDave · 2023-09-26T22:52:34Z

metaflow/plugins/cards/card_decorator.py

-        if len(component_strings) > 0:
-            temp_file = tempfile.NamedTemporaryFile("w", suffix=".json")
-            json.dump(component_strings, temp_file)
-            temp_file.seek(0)
-        executable = sys.executable
-        cmd = [
-            executable,
-            sys.argv[0],
-        ]
-        cmd += self._create_top_level_args() + [
-            "card",
-            "create",
-            runspec,
-            "--type",
-            self.attributes["type"],
-            # Add the options relating to card arguments.
-            # todo : add scope as a CLI arg for the create method.
-        ]
-        if self.card_options is not None and len(self.card_options) > 0:
-            cmd += ["--options", json.dumps(self.card_options)]
-        # set the id argument.
-
-        if self.attributes["timeout"] is not None:
-            cmd += ["--timeout", str(self.attributes["timeout"])]
-
-        if self._user_set_card_id is not None:
-            cmd += ["--id", str(self._user_set_card_id)]
-
-        if self.attributes["save_errors"]:
-            cmd += ["--render-error-card"]
-
-        if temp_file is not None:
-            cmd += ["--component-file", temp_file.name]
-
-        response, fail = self._run_command(
-            cmd, os.environ, timeout=self.attributes["timeout"]
-        )
-        if fail:
-            resp = "" if response is None else response.decode("utf-8")
-            self._logger(
-                "Card render failed with error : \n\n %s" % resp,
-                timestamp=False,
-                bad=True,
-            )
-
-    def _run_command(self, cmd, env, timeout=None):
-        fail = False
-        timeout_args = {}
-        if timeout is not None:
-            timeout_args = dict(timeout=int(timeout) + 10)
-        try:
-            rep = subprocess.check_output(
-                cmd, env=env, stderr=subprocess.STDOUT, **timeout_args
-            )
-        except subprocess.CalledProcessError as e:
-            rep = e.output
-            fail = True
-        except subprocess.TimeoutExpired as e:
-            rep = e.output
-            fail = True
-        return rep, fail


All of this logic got lifted into card_creator.py

valayDave · 2023-09-26T22:53:02Z

metaflow/plugins/cards/card_modules/card.py

+    # RELOAD_POLICY determines whether UIs should
+    # reload intermediate cards produced by render_runtime
+    # or whether they can just rely on data updates
+
+    # the UI may keep using the same card
+    # until the final card is produced
+    RELOAD_POLICY_NEVER = "never"
+
+    # the UI should reload card every time
+    # render_runtime() has produced a new card
+    RELOAD_POLICY_ALWAYS = "always"
+
+    # derive reload token from data and component
+    # content - force reload only when the content
+    # changes. The actual policy is card-specific,
+    # defined by the method reload_content_token()
+    RELOAD_POLICY_ONCHANGE = "onchange"
+
+    # this token will get replaced in the html with a unique
+    # string that is used to ensure that data updates and the
+    # card content matches
+    RELOAD_POLICY_TOKEN = "[METAFLOW_RELOAD_TOKEN]"


Important note on how card reload is expected to behave.

valayDave · 2023-09-26T22:54:28Z

metaflow/plugins/cards/component_serializer.py

 class WarningComponent(ErrorComponent):
    def __init__(self, warning_message):
        super().__init__("@card WARNING", warning_message)


+class ComponentStore:


Added this abstraction to handle under-the-hood manipulation of components when users call current.card.components

valayDave · 2023-09-26T22:56:03Z

metaflow/plugins/cards/component_serializer.py

+    ## Usage Patterns :
+
+    ```python
+    current.card["mycardid"].append(component, id="comp123")
+    current.card["mycardid"].extend([component])
+    current.card["mycardid"].refresh(data) # refreshes the card with new data
+    current.card["mycardid"].components["comp123"] # returns the component with id "comp123"
+    current.card["mycardid"].components["comp123"].update()
+    current.card["mycardid"].components.clear() # Wipe all the components
+    del current.card["mycardid"].components["mycomponentid"] # Delete a component
+    current.card["mycardid"].components["mynewcomponent"] = Markdown("## New Component") # Set a new component


Usage patterns for current.card["abc"] or current.card

valayDave · 2023-09-26T22:58:44Z

metaflow/plugins/cards/component_serializer.py

+        return len(self._components)
+
+
+class CardComponentManager:


CardComponentCollector: manages all cards decorators present on a @step

CardComponentManager : manages the additon/remove/updates of individual components for a specific cards. This is returned when users call current.card["abc"]

ComponentStore : Abstraction to handle the actual operations (append/extend/updates) over the components when users access components through the CardComponentManager.components

valayDave · 2023-09-26T23:00:02Z

metaflow/plugins/cards/card_modules/card.py

+        return
+
+    # FIXME document
+    def refresh(self, task, data):


structure of data is given by the return value of CardComponentManager._get_latest_data

romain-intel

Not fully done reviewing. First pass. I'll circle back after I looked at all the PRs to get a better understanding instead of asking stupid questions first. SOme comments though.

romain-intel · 2023-10-02T20:53:54Z

metaflow/plugins/cards/card_modules/card.py

@@ -32,9 +32,34 @@ class MetaflowCard(object):
        JSON-encodable dictionary containing user-definable options for the class.
    """

+    # RELOAD_POLICY determines whether UIs should


I would rephrase this a bit. If I understood correctly, reload really refers to the "html" or basically the "non-data" part and refers to things like change of layout and what not. I would change the variable to something like FULL_RELOAD_POLICY or something that shows that it is not just about "reloading" data. It can be a bit confusing ow since there are actually two "loads:

the card itself (layout, etc)

the data.

The data is reloaded much more frequently.

metaflow/plugins/cards/card_modules/card.py

metaflow/plugins/cards/component_serializer.py

romain-intel · 2023-10-02T22:40:35Z

metaflow/plugins/cards/card_cli.py

+                return _add_token_html(mf_card.render(task, data))
+            else:
+                return _add_token_html(mf_card.render(task))
+        elif mode == "render_runtime":


nit: We should check for both that data is not None and has the appropriate field. It uses it in the reload_token.

Done. This function now robustly handles many cases. I have also added a comment that explains what it is doing. I also added more comments in the card create function that helps understand the behavior for different modes and failure scenarios.

romain-intel · 2023-10-02T23:04:56Z

metaflow/plugins/cards/card_cli.py

            except:
                if render_error_card:
                    error_stack_trace = str(UnrenderableCardException(type, options))
                else:
                    raise UnrenderableCardException(type, options)
        #

-    if error_stack_trace is not None:
+    if error_stack_trace is not None and mode != "refresh":


confused on the mode "refresh" vs "render_runtime"? Is this a typo?

No, It's not a typo. Currently, the card_cli command takes three modes:

refresh: Involves passing JSON data used for data updates (JSON shipped to S3)

render_runtime: Involves rendering the card during runtime (HTML shipped to S3)

render: Involves rendering the card after runtime has completed execution. (HTML shipped to S3)

If the mode is refresh, we need to ensure that we don't create error cards, which happens in the line below this line.

romain-intel · 2023-10-03T05:03:55Z

metaflow/plugins/cards/card_creator.py

+            data_file = tempfile.NamedTemporaryFile("w", suffix=".json", delete=False)
+            json.dump(data, data_file)
+            data_file.seek(0)
+


I would close both files here if we are done with them.

File deletion is taken care of by the cli using the --delete-input-files argument in card cli. So overall if the file wasn't closed, it shouldn't be an issue, since the tempfile will be out of scope once the function completes execution.

I would probably prefer a pattern like

with tempfile....: # do stuff

as that would ensure cleanup even if the subprocess has an issue and never gets to cleanup. Here it feels like there are more cases where files can leak and stick around. We could use a similar mechanism to the one in S3 if you wanted to keep them around for debugging.

More context here.

many of these commands will run in async mode, so if we use a context manager, then we may delete the file before:

the async process is created
or

the async process has read the file.

The card create cli command deletes the file right after reading it.

metaflow/plugins/cards/component_serializer.py

romain-intel · 2023-10-03T05:09:05Z

metaflow/plugins/cards/component_serializer.py

+    It exposes the `append` /`extend` methods like an array to add components.
+    It also exposes the `__getitem__`/`__setitem__` methods like a dictionary to access the components by thier Ids.
+
+    The reason this has dual behavior is because components cannot be stored entirely as a map/dictionary because


I believe that dicts are ordered per the language starting in py 3.7 (insertion order) so we may be able to simplify this.

That's great! It will certainly remove all the complexity of managing/updating two objects (a dict and list). We can keep the class as is but make its internal implementation much much simpler!

This hasn't changed yet right?

This hasn't changed; I think changing this will result in breaking functionality for anything < py37; Currently cards can work < py37

I think we are fine with new functionality not working for such old versions of python. This will actually work in python 3.6+ so it's really just 3.5. Up to you but it would simplify things a bit :) .

valayDave

I addressed @romain-intel 's comments and added a little more code. Here is the summary of all the new changes added to the code :

card_cli.py
- Refactored the update_card method to handle many different cases gracefully.
- The logic of the function is the same, it just gracefully handles the cases where render_runtime/refresh are not implemented
- The main logic responsible rendering error cards now robustly handles cases of timeout and missing implementation of the render_runtime method.
- Added more comments that help with the readability of the code.
card.py
- rename MetaflowCard.IS_RUNTIME_CARD to MetaflowCard.RUNTIME_UPDATABLE
- rename MetaflowCardComponent.id to MetaflowCardComponent.component_id
card_creator.py:
- fix bug for the edge case were async sidecars handling card creation may not have completed execution.
card_decorator.py:
- Refactor based on the rename of MetaflowCard.IS_RUNTIME_CARD to MetaflowCard.RUNTIME_UPDATABLE
component_serializer.py:
- Refactor based on the rename of MetaflowCardComponent.id to MetaflowCardComponent.component_id
- Remove the FIXME in the __setitem__ function to not allow overwriting to card components via current.card.components[ID] = NewComponent
- Add functionality to the current.card.refresh method that allows re-rendering cards when the layout of the card has changed. This mean that when new components are added to a card or when components are removed from a card then a re-render should be triggered. Modified the ComponentStore to expose a property that tells when the component array was last modified. The CardComponentManager uses this property in the ComponentStore to determine if a re-render should be triggered.
- Add checks to ensure that objects passed to current.card.refresh(data) are JSON serializable. If these are not JSON serializable then warn the user that they are not JSON serializable.
exception.py
- Introduce a new exception for when users try to over-write components.

valayDave · 2023-10-17T23:22:24Z

metaflow/plugins/cards/card_cli.py

+    # In the entire card rendering process, there are a few cases we want to handle:
+    # - [mode == "render"]
+    #   1. Card is rendered successfull (We store it in the datastore as a HTML file)
+    #   2. Card is not rendered successfully and we have --save-error-card flag set to True
+    #      (We store it in the datastore as a HTML file with stack trace)
+    #   3. Card render timeout and we have --save-error-card flag set to True
+    #      (We store it in the datastore as a HTML file with stack trace)
+    #   4. `render` returns nothing and we have --save-error-card flag set to True.
+    #       (We store it in the datastore as a HTML file with some message saying you returned nothing)
+    # - [mode == "render_runtime"]
+    #   1. Card is rendered successfully (We store it in the datastore as a HTML file)
+    #   2. `render_runtime` is not implemented but gets called and we have --save-error-card flag set to True.
+    #       (We store it in the datastore as a HTML file with some message saying the card should not be a runtime card if this method is not Implemented)
+    #   3. `render_runtime` is implemented and raises an exception and we have --save-error-card flag set to True.
+    #       (We store it in the datastore as a HTML file with stack trace)
+    #   4. `render_runtime` is implemented but returns nothing and we have --save-error-card flag set to True.
+    #       (We store it in the datastore as a HTML file with some message saying you returned nothing)
+    #   5. `render_runtime` is implemented but times out and we have --save-error-card flag set to True.
+    #       (We store it in the datastore as a HTML file with stack trace)
+    # - [mode == "refresh"]
+    #   1. Data update is created successfully (We store it in the datastore as a JSON file)
+    #   2. `refresh` is not implemented. (We do nothing. Don't store anything.)
+    #   3. `refresh` is implemented but it raises an exception. (We do nothing. Don't store anything.)
+    #   4. `refresh` is implemented but it times out. (We do nothing. Don't store anything.)
+


@romain-intel : since the code block below has a fair amounts of complexity in it; this comment is meant to simplify the understanding of its behavior. I did this because we now have more modes to the command than earlier.

valayDave · 2023-10-17T23:25:39Z

metaflow/plugins/cards/card_cli.py

+    if (
+        rendered_info.is_implemented
+        and rendered_info.timed_out
+        and mode != "refresh"
+        and render_error_card
+    ):
+        timeout_stack_trace = (
+            "\nCard rendering timed out after %s seconds. "
+            "To increase the timeout duration for card rendering, please set the `timeout` parameter in the @card decorator. "
+            "\nStack Trace : \n%s"
+        ) % (timeout, rendered_info.timeout_stack_trace)
+        rendered_content = error_card().render(
+            task,
+            stack_trace=timeout_stack_trace,
+        )
+    elif (
+        rendered_info.is_implemented
+        and rendered_info.data is None
+        and render_error_card
+        and mode != "refresh"
+    ):
+        rendered_content = error_card().render(
            task, stack_trace="No information rendered From card of type %s" % type
        )
+    elif (
+        not rendered_info.is_implemented
+        and render_error_card
+        and mode == "render_runtime"
+    ):
+        message = (
+            "Card of type %s is a runtime time card with no `render_runtime` implemented. "
+            "Please implement `render_runtime` method to allow rendering this card at runtime."
+        ) % type
+        rendered_content = error_card().render(task, stack_trace=message)


This is a tiny drive-by chore. This code block achieves the following:

If a card has timed out then provide a nicer visible message of what timed out and the stack trace of the timeout

If a card returned no information then inform the user that the render method returned nothing so nothing got rendered.

If a card that gets set to RUNTIME_UPDATABLE doesn't have a render_runtime method implemented then inform that to the user.

valayDave · 2023-10-17T23:26:21Z

metaflow/plugins/cards/card_client.py

+    def _get_data(self) -> Optional[dict]:
+        # currently an internal method to retrieve a card's data.
+        data_paths = self._card_ds.extract_data_paths(
+            card_type=self.type, card_hash=self.hash, card_id=self._card_id
+        )
+        if len(data_paths) == 0:
+            return None
+        return self._card_ds.get_card_data(data_paths[0])
+


The card client now has an internal method to get data updates. This will be useful when building the card server and modifying the metaflow UI codebase.

If it is going to be used by a card server/UI, I would make it public

valayDave · 2023-10-17T23:28:11Z

metaflow/plugins/cards/card_client.py

@@ -172,7 +181,7 @@ def _get_card(self, index):
        if index >= self._high:
            raise IndexError
        path = self._card_paths[index]
-        card_info = self._card_ds.card_info_from_path(path)
+        card_info = self._card_ds.info_from_path(path, suffix=CardNameSuffix.CARD)


drive-by function rename since it's now got a multi-faceted utility.

The info_from_path can be used to infer information from a path for data updates / cards. info_from_path provides a named tuple containing hash / id / type etc.

valayDave · 2023-10-17T23:29:04Z

metaflow/plugins/cards/card_creator.py

+            if _async_proc and _async_proc.poll() is None:
+                if time.time() - _async_started > async_timeout:
+                    CardProcessManager._remove_card_process(card_uuid)
+                    # Since we have removed the card process, we are free to run a new one
+                    # This will also ensure that when a old process is removed a new one is replaced.
+                    return self._run_command(
+                        cmd, card_uuid, env, wait=wait, timeout=timeout
+                    )


If a process took longer then timeout then kill the process and re-run the command that re-creates the process.

valayDave · 2023-10-17T23:46:56Z

metaflow/plugins/cards/card_modules/card.py

+    @property
+    def component_id(self):
+        return self._component_id
+
+    @component_id.setter
+    def component_id(self, value):
+        if not isinstance(value, str):
+            raise TypeError("id must be a string")
+        self._component_id = value
+


@romain-intel : Replaced id in favor of component_id

valayDave · 2023-10-18T00:57:55Z

metaflow/plugins/cards/component_serializer.py

+        if self._component_map.get(key) is not None:
+            # This is the equivalent of calling `current.card.components["mycomponent"] = Markdown("## New Component")`
+            # We don't support the replacement of individual components in the card.
+            # Instead we support rewriting the entire component array instead.
+            # So users can run `current.card[ID] = [FirstComponent, SecondComponent]` which will instantiate an entirely
+            # new ComponentStore.
+            # So we should throw an error over here, since it is clearly an operation which is not supported.
+            raise ComponentOverwriteNotSupportedException(
+                key, self._user_set_id, self._card_type
+            )
+        else:
+            self._store_component(value, component_id=key)


@tuulos : FYI

What's allowed :

current.card["mycard_id"] = [FirstComponent, SecondComponent]

Whats NOT allowed:

current.card["mycard_id"].components["mycomponent"] = NewComponent

If users try to over-write individual components in a card then we raise an Exception. The key reason for doing this and not throwing warnings is to make it explicit and avoid any behaviors that confuse users. Here is an example to explain its reasoning. Consider the following code block :

current.card.append(Markdown("hi"), id="A") current.card.components["A"].update("new Hi") current.card.components["A"] = Image(...) current.card.components["A"].update(new_image_bytes)

If we throw warnings instead of an exception, the above case will throw a TypeError or a ValueError since component A was still a markdown but the user assumed it was an image after setting current.card.components["A"] = Image(...). This may end up confusing the user. Instead, it's better to be completely unambiguous about supported behavior.

makes sense

valayDave · 2023-10-18T00:59:39Z

metaflow/plugins/cards/component_serializer.py

+        # This block of code will render the card in `render_runtime` mode when:
+        # 1. refresh is called with `force=True`
+        # 2. Layout of the components in the card has changed. i.e. The actual elements in the component array have changed.
+        # 3. The last time the card was rendered was more the minimum interval after which they should be rendered.
+        last_rendered_before_minimum_interval = (
+            nu - self._last_refresh
+        ) > RUNTIME_CARD_RENDER_INTERVAL
+        layout_has_changed = (
+            self._last_layout_change != self.components.layout_last_changed_on
+        )
+
+        if force or last_rendered_before_minimum_interval or layout_has_changed:
+            self._render_seq += 1
+            self._last_render = nu
+            self._card_proc("render_runtime")
+            self._last_layout_change = self.components.layout_last_changed_on
+        else:
+            self._card_proc("refresh")


Code logic that determines whether to call the card CLI in render_runtime mode or refresh mode.

valayDave · 2023-10-18T01:08:44Z

metaflow/plugins/cards/component_serializer.py

+    def components(self):
+        # FIXME: document
+        if self._default_editable_card is None:
+            if len(self._card_component_store) == 1:
+                card_type = list(self._cards_meta.values())[0]["type"]
+                _warning_msg = [
+                    "Card of type `%s` is not an editable card." % card_type,
+                    "Components list will not be updated and `current.card.components` will not work for any call during this runtime execution.",
+                    "Please use an editable card",  # todo : link to documentation
+                ]
+            else:
+                _warning_msg = [
+                    "`current.card.components` cannot disambiguate between multiple @card decorators.",
+                    "Components list will not be accessible and `current.card.components` will not work for any call during this runtime execution.",
+                    "To fix this set the `id` argument in all @card when using multiple @card decorators over a single @step and reference `current.card[ID].components`",  # todo : Add Link to documentation
+                    "to update/access the appropriate card component.",
+                ]
+            if not self._warned_once["components"]:
+                self._warning(" ".join(_warning_msg))
+                self._warned_once["components"] = True
+            return
+
+        return self._card_component_store[self._default_editable_card].components


@tuulos : Something interesting is happening in this case where we may need to change the behavior. The current behavior of cards is that current.card.append will result in warnings when you try to add components to a card that doesn't exist or when current.card cannot figure out the default editable card.

Now that we are supporting runtime updates, we need to have a clear story of how we deal with this scenario:

# `current.card` cannot disambiguate the default editable card current.card.append(componentA, id="a") # this will throw a warning. current.card.components["a"].update(...) # This will crash since currently we are not returning anything componentA.update(...) # this will work, but nothing will change on the UI

The main point is that if append and extend only throw warnings, it will also affect how the code behaves when update is called. Should we now throw exceptions when users try to append to cards that cannot be disambiguated?

the main case we want to handle is card updates hidden deep inside a library, referring to a card that you forgot to add --with card. To continue the policy that cards are best-effort and will never crash the user code, we could handle even

current.card.components["a"].update(...)

leniently and make a non-existing ID return a stub object that just shows a warning. I don't think this is a common case but being lenient sometimes and not other times is confusing too. We could just be always lenient - just show warnings, never crash

metaflow/plugins/cards/component_serializer.py

valayDave · 2023-10-19T21:49:47Z

metaflow/plugins/cards/card_cli.py

+                rendered_info = CardRenderInfo(
+                    mode=mode,
+                    is_implemented=True,
+                    data=None,
+                    timed_out=False,
+                    timeout_stack_trace=None,
+                )


Bug discovered because of Tests.

valayDave · 2023-10-19T23:45:57Z

Tests are failing rn because they are timing out get_card method called in the testing code; We are calling get_card to verify that cards are getting created during runtime. This happens when we run all tests at the same time and not when we individually run only the card tests in the same gh action runner. Will keep posted once I find a fix.

valayDave · 2023-10-31T19:41:34Z

test_runner

+# We run realtime cards tests separately because there these tests validate the asynchronous updates to the
+# information stored in the datastore. So if there are other processes starving resources then these tests will
+# surely fail since a lot of checks have timeouts. 
+run_runtime_card_tests() {
+  CARD_GRAPHS="small-foreach,small-parallel,nested-branches,single-linear-step,simple-foreach"
+  cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 8 --contexts python3-all-local-cards-realtime --graphs $CARD_GRAPHS && cd ../../
+}
+
+install_deps && install_extensions && run_tests && run_runtime_card_tests


These are the changes from the latest commit I pushed. This will ensure that the realtime cards tests run after the normal metaflow test suite and are not included as a part of the test suite. The card tests will fail if there are too many processes; When too many processes are running it starves the metaflow's card-update processes of resources. Due to starvation of resources metaflow's card-update processes are unable to ship card/data updates and the test time's out. When ran in the happy case(i.e. no resource starvation) tests test succeed.

Nothing else has been changed in the latest rebase.

romain-intel

i have a few high level concerns and a bunch of small comments. At a high level:

I think we can clean up the render/render_runtime methods a bit and make it clearer around what refreshable cards need.
I am not sure if we wouldn't need "section" in the data file given the breaking up of a card into components
I have some concerns around the launching of subprocesses and the possibility of having run-away ones.

romain-intel · 2023-10-26T05:59:49Z

metaflow/plugins/cards/card_modules/card.py

@@ -68,8 +94,43 @@ def render(self, task) -> str:
        """
        return NotImplementedError()

+    # FIXME document


Yes, please document. I've been trying to add docstrings and all that so please document according to that. In particular, metaflow.Task (and not Task) and Data can be Optional[Any] I suppose.

cc @tuulos ;

makes sense - render takes a metaflow.Task always

romain-intel · 2023-10-26T06:00:44Z

metaflow/plugins/cards/card_modules/card.py

+    @component_id.setter
+    def component_id(self, value):
+        if not isinstance(value, str):
+            raise TypeError("id must be a string")


nit: Component ID must be a string.

romain-intel · 2023-10-26T06:02:14Z

metaflow/plugins/cards/card_cli.py

-    rendered_info = None
+def update_card(mf_card, mode, task, data, timeout_value=None):
+    """
+    This method will be resposible for returning creating a card/data-update based on the `mode` passed.


nit: responsible.

Also "returning creating"?

metaflow/plugins/cards/card_cli.py

romain-intel · 2023-11-01T23:34:18Z

metaflow/plugins/cards/component_serializer.py

+    It exposes the `append` /`extend` methods like an array to add components.
+    It also exposes the `__getitem__`/`__setitem__` methods like a dictionary to access the components by thier Ids.
+
+    The reason this has dual behavior is because components cannot be stored entirely as a map/dictionary because


This hasn't changed yet right?

metaflow/plugins/cards/component_serializer.py

romain-intel · 2023-11-01T23:38:43Z

metaflow/plugins/cards/component_serializer.py

+            or self._last_layout_change is None
+        )
+
+        if force or last_rendered_before_minimum_interval or layout_has_changed:


nit: can proably bypass stuff if force is true :)

can you expand on that? I didn't quite get what meant

I think I menat that the part before this can probably be skipped since we know we are going to force it.

metaflow/plugins/cards/component_serializer.py

metaflow/plugins/cards/card_creator.py

metaflow/plugins/cards/card_datastore.py

metaflow/plugins/cards/card_decorator.py

valayDave · 2023-11-03T17:11:33Z

metaflow/plugins/cards/card_modules/card.py

@@ -68,8 +94,43 @@ def render(self, task) -> str:
        """
        return NotImplementedError()

+    # FIXME document


cc @tuulos ;

valayDave · 2023-11-03T17:14:44Z

metaflow/plugins/cards/component_serializer.py

+    It exposes the `append` /`extend` methods like an array to add components.
+    It also exposes the `__getitem__`/`__setitem__` methods like a dictionary to access the components by thier Ids.
+
+    The reason this has dual behavior is because components cannot be stored entirely as a map/dictionary because


This hasn't changed; I think changing this will result in breaking functionality for anything < py37; Currently cards can work < py37

metaflow/plugins/cards/component_serializer.py

valayDave · 2023-11-03T17:17:43Z

metaflow/plugins/cards/component_serializer.py

+            or self._last_layout_change is None
+        )
+
+        if force or last_rendered_before_minimum_interval or layout_has_changed:


can you expand on that? I didn't quite get what meant

metaflow/plugins/cards/component_serializer.py

- added interface to MetaflowCardComponent for making it REALTIME_UPDATABLE - add realtime component rendering capabiity. - Changed ville's abstraction of CardComponents to CardComponentManager - introduced the `current.card.components` / `current.card['abc'].components` interface - `current.card.components` interface helps access/remove components

- abstraction to handle card creation for decorator (and in future outside code) - Create CardProcessManager class that helps manage the card processes running - Remove `_card_proc` logic from code. - fix card-refresh-bug caused by cardproc instance-method refactor - the `_card_proc` method is an instance method of a card decorator which is passed to componentCollector. This is done so that ComponentCollector can call the method when a refresh is called for an individual card. - Since there is only one copy of the ComponentCollector it created an issue when other cards were trying to call refresh (since ComponentCollector was instantiated with a **single card decorator's** `_card_proc`) - This commit refactored the code to handle multiple cards calling refresh

1. `card_cli.py` - Refactored the `update_card` method to handle many different cases gracefully. - The logic of the function is the same, it just gracefully handles the cases where `render_runtime`/`refresh` are not implemented - The main logic responsible rendering error cards now robustly handles cases of timeout and missing implementation of the `render_runtime` method. - Added more comments that help with the readability of the code. 2. `card.py` - rename `MetaflowCard.IS_RUNTIME_CARD` to `MetaflowCard.RUNTIME_UPDATABLE` - rename `MetaflowCardComponent.id` to `MetaflowCardComponent.component_id` 3. `card_creator.py`: - fix bug for the edge case were async sidecars handling card creation may not have completed execution. 4. `card_decorator.py`: - Refactor based on the rename of `MetaflowCard.IS_RUNTIME_CARD` to `MetaflowCard.RUNTIME_UPDATABLE` 5. `component_serializer.py`: - Refactor based on the rename of `MetaflowCardComponent.id` to `MetaflowCardComponent.component_id` - Remove the `FIXME` in the `__setitem__` function to not allow overwriting to card components via `current.card.components[ID] = NewComponent` - Add functionality to the `current.card.refresh` method that allows re-rendering cards when the layout of the card has changed. This mean that when new components are added to a card or when components are removed from a card then a re-render should be triggered. Modified the `ComponentStore` to expose a property that tells when the component array was last modified. The `CardComponentManager` uses this property in the `ComponentStore` to determine if a re-render should be triggered. - Add checks to ensure that objects passed to `current.card.refresh(data)` are JSON serializable. If these are not JSON serializable then warn the user that they are not JSON serializable. - introduced a `component_update_ts` filed in the return value of `current.card._get_latest_data` method. - Added comments to `current.card._get_lastest_data` for readability. 6. `exception.py` - Introduce a new exception for when users try to over-write components.

- Added methods to card datastore to read data updates - Add a private method in card client that can help get the latest data. - Refactored methods in the code to explictly provide all arguments to disambiguate between data reads and HTML reads.

- component-id setter - safe-access components that don't exist - warnings for `update` methods of un-implemented UserComponents - warnings when components don't exist

… processes - There is a case where we could have a async process and another non async process run in parallel. This is something we are not okay with because the async one(render_runtime) can override the values of the sync process. - This commit fixes this race-condition. - When we call the final render+refresh, we will wait for any async process to finish complete execution and ensure that both methods/processes complete execution.

…origin_pathspec in CardContainer

- Introduced a `runtime_data` property that has access to the final data object when `render` is called. - Remove the usage of `inspect`

romain-intel

OK. I looked at the diff you sent me and quickly at this. I think it looks ok so we should be good for an initial merge. I haven't extensively played with it too much but I trust you ahve :).

valayDave marked this pull request as ready for review September 26, 2023 22:47

valayDave commented Sep 26, 2023

View reviewed changes

valayDave requested review from tuulos, romain-intel and saikonen September 26, 2023 23:00

romain-intel reviewed Oct 3, 2023

View reviewed changes

valayDave force-pushed the valay/rtc-stack-0 branch 4 times, most recently from 21d21fb to fd1502d Compare October 18, 2023 00:44

valayDave commented Oct 18, 2023

View reviewed changes

metaflow/plugins/cards/component_serializer.py Outdated Show resolved Hide resolved

valayDave commented Oct 18, 2023

View reviewed changes

metaflow/plugins/cards/component_serializer.py Outdated Show resolved Hide resolved

valayDave commented Oct 18, 2023

View reviewed changes

metaflow/plugins/cards/component_serializer.py Outdated Show resolved Hide resolved

valayDave force-pushed the valay/rtc-stack-0 branch 3 times, most recently from 6890054 to ec58dd0 Compare October 19, 2023 21:48

valayDave commented Oct 19, 2023

View reviewed changes

valayDave force-pushed the valay/rtc-stack-0 branch from ec58dd0 to 83d7347 Compare October 31, 2023 19:36

valayDave commented Oct 31, 2023

View reviewed changes

romain-intel reviewed Nov 1, 2023

View reviewed changes

valayDave commented Nov 3, 2023

View reviewed changes

tuulos and others added 7 commits November 20, 2023 23:03

[current.card.refresh] refactor cards to support runtime updates

a2adc21

[card-refresh] implement card reload policy

54e637f

[card][tests] verify the behavior of .refresh / .components

2bfe12f

valayDave added 11 commits November 20, 2023 23:03

[card-components][core-change][MET-113]

93137fe

- component-id setter - safe-access components that don't exist - warnings for `update` methods of un-implemented UserComponents - warnings when components don't exist

[card-decorator][MET-103] refresh-interval

7653366

[card-client][MET-106] Card.get_data as public fn

6edba98

[card-cli][MET-105] Simplify timeout logic

2217d3d

[card-core][chore][MET-111] Remove unsed imports

3afe942

[card-datastore][MET-108] simplify path creation logic

20532ce

[component-serializer][MET-108] simplify component storage logic

4e7f32d

[card-client][MET-117] remove from_resumed and rather derive it from …

d9c1437

…origin_pathspec in CardContainer

[cleanup][MET-107] typo / hygiene

0005550

[card-core][MET-104] revert MetaflowCard.render signature to original

a719a43

- Introduced a `runtime_data` property that has access to the final data object when `render` is called. - Remove the usage of `inspect`

valayDave force-pushed the valay/rtc-stack-0 branch from 993d6f0 to a719a43 Compare November 21, 2023 07:03

saikonen mentioned this pull request Nov 28, 2023

[realtime-cards] backend API Methods Netflix/metaflow-service#402

Merged

romain-intel approved these changes Dec 1, 2023

View reviewed changes

romain-intel merged commit f8e4e6b into master Dec 5, 2023
22 checks passed

romain-intel deleted the valay/rtc-stack-0 branch December 5, 2023 09:26

[@card] Realtime cards (1/N) #1550

[@card] Realtime cards (1/N) #1550

Conversation

valayDave commented Sep 26, 2023 • edited Loading

Realtime Cards

PR Stack

Metaflow UI Changes PRs

Key Changes in the PR

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

romain-intel left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

valayDave left a comment

Choose a reason for hiding this comment

valayDave Oct 17, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

valayDave commented Oct 19, 2023 • edited Loading

Choose a reason for hiding this comment

romain-intel left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

romain-intel left a comment

Choose a reason for hiding this comment

valayDave commented Sep 26, 2023 •

edited

Loading

valayDave Oct 17, 2023 •

edited

Loading

valayDave commented Oct 19, 2023 •

edited

Loading