Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] For loops in nested pipelines have clashing component names #11484

Open
m-dz opened this issue Dec 23, 2024 · 2 comments
Open

[bug] For loops in nested pipelines have clashing component names #11484

m-dz opened this issue Dec 23, 2024 · 2 comments

Comments

@m-dz
Copy link

m-dz commented Dec 23, 2024

Environment

  • Python 3.10.13

  • KFP SDK version:
    2.11.0 (current on 23/12/2014)

Steps to reproduce

Simple example with:

  • 1 "main pipeline" having a loop which runs "pipeline sub"
  • "Pipeline sub" with 2 pipelines "aaa" and "bbb" with for loops each

What I think is happening when pipelines are being merged:

  1. Pipeline "aaa" has a for loop which is renamed to for-loop-1 (irrespective of its name in the pipeline definition)
  2. This aaa is merged to the "main" pipeline, which already has component specs for "for-loop-1", and now is getting new comp specs for "for-loop-1-2"
  3. Pipeline "bbb" has a for loop which is again renamed to for-loop-1 (again, pipeline definition name is ignored)
  4. "bbb" is merged to "main", which OVERWRITES comp specs for "for-loop-1-2"
  5. Now, we have the following components:
    • "comp-for-loop-1"
    • "comp-for-loop-1-2" with "bbb"'s task
    • "comp-pipeline-aaa" referring to "comp-for-loop-1-2" (incorrectly)
    • "comp-pipeline-bbb" referring to "comp-for-loop-1-2" again

Expected result

  1. Each for loop will have its own component
  2. Each for loop task will refer to a correct component

Expected behaviour can be obtained by making two relatively simple changes in the TasksGroup class:

  1. Adding self.name = name in https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/dsl/tasks_group.py#L69 , which provides a name for the component to use
  2. Removing the _make_name_unique method (https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/dsl/tasks_group.py#L87) from it, which stops the name to be overwritten

Materials and reference

The following 2 files reproduce the issue:

# pipelines/pipeline_main.py

from typing import List

from kfp.compiler import compiler
from kfp.dsl import pipeline, component, ParallelFor

from pipelines.pipeline_sub import pipeline_sub


@component(base_image='python:3.10')
def get_ids_for_main_loop() -> List[int]:
    return [1,2,3]


@pipeline(name="main")
def main_pipeline():
    get_ids_for_main_loop_task = get_ids_for_main_loop()

    # This is getting `comp-for-loop-1`
    main_loop = ParallelFor(
        items=get_ids_for_main_loop_task.output,
        name="main-loop-zzz",
    )
    with main_loop as _ignore:
        pipeline_sub()


if __name__ == '__main__':
    compiler.Compiler().compile(
        pipeline_func=main_pipeline,
        package_path='pipeline_main.yaml',
    )

# pipelines/pipeline_sub.py

from typing import List

from kfp.dsl import ParallelFor, component, pipeline


@component(base_image='python:3.10')
def get_ids_for_loop_aaa() -> List[int]:
    return [1,2,3]


@component(base_image='python:3.10')
def get_ids_for_loop_bbb() -> List[int]:
    return [7,8,9]


@component(base_image='python:3.10')
def print_ids_aaa(_id: int):
    print(f'Id aaa: {_id}')


@component(base_image='python:3.10')
def print_ids_bbb(_id: int):
    print(f'Id bbb: {_id}')


@pipeline(name="pipeline_aaa")
def pipeline_aaa():
    get_ids_for_loop_aaa_task = get_ids_for_loop_aaa()

    loop_aaa = ParallelFor(
        items=get_ids_for_loop_aaa_task.output,
        name="loop-aaa-zzz",
    )
    with loop_aaa as _id:
        print_ids_aaa(_id=_id)


@pipeline(name="pipeline_bbb")
def pipeline_bbb():
    get_ids_for_loop_bbb_task = get_ids_for_loop_bbb()

    loop_bbb = ParallelFor(
        items=get_ids_for_loop_bbb_task.output,
        name="loop-bbb-zzz",
    )
    with loop_bbb as _id:
        print_ids_bbb(_id=_id)


@pipeline(name="pipeline_sub")
def pipeline_sub():
    pipeline_aaa_task = pipeline_aaa()  # This is getting `comp-for-loop-1-2` when merged to main
    pipeline_bbb_task = pipeline_bbb()  # This is OVERWRITING `comp-for-loop-1-2` when merged to main...
    pipeline_bbb_task.after(pipeline_aaa_task)

Labels

/area components


Impacted by this bug? Give it a 👍.

@m-dz m-dz added the kind/bug label Dec 23, 2024
@m-dz m-dz changed the title [bug] <Bug Name> [bug] For loops in nested pipelines have clashing names Dec 23, 2024
@m-dz m-dz changed the title [bug] For loops in nested pipelines have clashing names [bug] For loops in nested pipelines have clashing component names Dec 23, 2024
@m-dz
Copy link
Author

m-dz commented Jan 6, 2025

To add more details, I'll link to the code below, but in short, at some point pipelines are being "merged" into one, big YAML (I think) file where each task and component from each of the sub-pipelines sits. So e.g. if we have the "main", then "pipeline_sub", at some point the two are merged together into this single file where every component "should" have an unique name etc.

It all sits in kfp/compiler/pipeline_spec_builder.py:

merge_deployment_spec_and_component_spec: https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/compiler/pipeline_spec_builder.py#L1629
_merge_component_spec: https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/compiler/pipeline_spec_builder.py#L1671

The latter builds the component renaming dict old_name_to_new_name, and only then renames the components, so we are potentially renaming different components to the same name, as illustrated above.

There is also _merge_deployment_spec, but that seems to be fine.

@m-dz
Copy link
Author

m-dz commented Jan 7, 2025

I just found a stale PR #11071 (I searched in issues, not PRs...) addressing this exact issue, but there seems to be a simpler solution by not enforcing the ParallelFor renaming to for-loop-X. The approach to enforce this seems strange, why not trust users to correctly (in a unique way) name their loops?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant