Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runner IO: Annotating any Input fails #1238

Open
3 of 6 tasks
sam2x opened this issue Oct 14, 2024 · 5 comments
Open
3 of 6 tasks

Runner IO: Annotating any Input fails #1238

sam2x opened this issue Oct 14, 2024 · 5 comments
Assignees
Labels
type:bug A general bug

Comments

@sam2x
Copy link

sam2x commented Oct 14, 2024

Pre-bug-report checklist

1. This bug can be reproduced using pure Argo YAML

  • No

2. I have searched for existing issues

  • Yes

3. This bug occurs in Hera when...

  • exporting to YAML
  • submitting to Argo
  • running on Argo with the Hera runner
  • other:

Bug report

Describe the bug

Hello, I have an issue with IO function parameter using Runner constructor that fail when I annotate Input/Ouput object.

I spotted it with my code, and then reproduced it with a simple example. I need this approach cause:

  1. Only one function parameter can be specified when using an Input.
  2. I need to annotate information that differ from one function to another, with the same IO object. Else I have to create a new model just for annotation.

This works:

@script(constructor="runner")
def pydantic_io(my_input: MyInput) -> MyOutput

This doesnt works:

@script(constructor="runner")
def pydantic_io(my_input: Annotated[MyInput, {"bla":"whatever"}]) -> MyOutput

_Error log:

hera.exceptions.BadRequest: Server returned status code 400 with message: `templates.main.steps[1].pydantic-io templates.pydantic-io inputs.parameters.my-input was not supplied`

To Reproduce

from hera.workflows.io import Input, Output                                                                                                                   
from hera.workflows import Artifact, ArtifactLoader, Parameter, Steps, Workflow, script                                                                       
from hera.workflows.archive import NoneArchiveStrategy                                                                                                        
from prj.cfg.config import *                                                                                                                                
                                                                                                                                                              
from typing import Annotated                                                                                                                                  
from pydantic import BaseModel                                                                                                                                
                                                                                                                                                              
                                                                                                                                                              
class MyObject(BaseModel):                                                                                                                                    
    a_dict: dict  # not giving a default makes the field a required input for the template                                                                    
    a_str: str = "a default string"                                                                                                                           
                                                                                                                                                              
                                                                                                                                                              
class MyInput(Input):                                                                                                                                         
    param_int: Annotated[int, Parameter(name="param-input")] = 42                                                                                             
    an_object: Annotated[MyObject, Parameter(name="obj-input")] = MyObject(                                                                                   
        a_dict={"my-key": "a-value"}, a_str="hello world!"                                                                                                    
    )                                                                                                                                                         
    artifact_int: Annotated[int, Artifact(name="artifact-input", loader=ArtifactLoader.json)]                                                                 

class MyOutput(Output):
    param_int: Annotated[int, Parameter(name="param-output")]
    artifact_int: Annotated[int, Artifact(name="artifact-output")]


@script(constructor="runner")
def writer() -> Annotated[int, Artifact(name="int-artifact", archive=NoneArchiveStrategy())]:
    return 100

@script(constructor="runner")
def pydantic_io(
        my_input: Annotated[MyInput, Parameter(name="my-input")],
) -> MyOutput:
    return MyOutput(exit_code=1, result="Test!", param_int=42, artifact_int=my_input.param_int)

with Workflow(generate_name="pydantic-io-") as w:
    with Steps(name="main"):

        write_step = writer()
        pydantic_io(
            arguments=[
                write_step.get_artifact("int-artifact").with_name("artifact-input"),
                {
                    "param_int": 101,
                    "an_object": MyObject(a_dict={"my-new-key": "my-new-value"}),
                },
            ]
        )
    w.create()

Expected behavior

Being somehow able to annotate Input/Output object-based in my function definition. Or any alternative.

Environment

  • Hera Version: 5.17.1
  • Python Version: 3.11.6
  • Argo Version: 3.5.2
@sam2x sam2x changed the title Runner IO: Annotating any Input/Output fails Runner IO: Annotating any Input fails Oct 14, 2024
@elliotgunton
Copy link
Collaborator

Hey @sam2x! I'm unsure if there's a misunderstanding of the feature, or you want to do something different, if you could help clarify! 🙏

When you use Input as a function parameter, that is how you tell Hera to use all the fields of your subclass of Input as template parameters. When you use Parameter in an annotation, that is telling Hera that "this is a single template parameter".

Therefore

my_input: Annotated[MyInput, Parameter(name="my-input")],

doesn't really make sense - you want an input parameter called my-input, but it is also an Input class, so its fields should also be parameters.

If you want to use metadata in Annotated which is unrelated to Hera (i.e. for your own purposes), that should also work

my_input: Annotated[MyInput, "some metadata"],

From your description

  1. Only one function parameter can be specified when using an Input.
  2. I need to annotate information that differ from one function to another, with the same IO object. Else I have to create a new model just for annotation.

I'm not sure what you mean by "annotate information"? Are the parameters different? Or are the defaults different for each function but using the same parameters? etc...

Please let me know so I can try to help! 🚀

@elliotgunton elliotgunton added the note:needs-info We need more info from the issue submitter label Oct 15, 2024
@sam2x
Copy link
Author

sam2x commented Oct 15, 2024

Hello @elliotgunton, thanks for your answer.

Therefore

my_input: Annotated[MyInput, Parameter(name="my-input")],

doesn't really make sense - you want an input parameter called my-input, but it is also an Input class, so its fields should also be parameters.

You are right, I tried in despair different approach, but the reuse of "Parameter" inside an Input was the worst example.

If you want to use metadata in Annotated which is unrelated to Hera (i.e. for your own purposes), that should also work
my_input: Annotated[MyInput, "some metadata"],

That's exactly the issue, the sample won't work even with your example, the hera "signature" parser doesn't seems to detect MyInput when it is specified in this way. Based on your statement, and updating my bug sample, if you try to change from this working definition:

def pydantic_io(my_input: MyInput) -> MyOutput

To this:

def pydantic_io(my_input: Annotated[MyInput, "some metadata"]) -> MyOutput

This would raise the following exception on my side:

hera.exceptions.BadRequest: Server returned status code 400 with message: `templates.main.steps[1].pydantic-io templates.pydantic-io inputs.parameters.my_input was not supplied`

Hope i didn't misinterpret your explanation, and my comment is relevant to you.
Br,
Sam

@elliotgunton
Copy link
Collaborator

That's perfect, thank you! All clear from my side, that is indeed a bug/feature we need to fix/add 🚀

@sam2x
Copy link
Author

sam2x commented Oct 15, 2024

A bit more hints for the one handling the issue. The bug is located in "_map_argo_inputs_to_function".
The function doesnt seems to unwrap the Annotated types before making checks of the type.

for func_param_name, func_param in inspect.signature(function).parameters.items():

Need something similar to unwrap/check via "get_origin" check first (first 3 lines of my code below) and InputV1/InputV2 issubclass check against annotation (last 2 lines):

[...]
    for func_param_name, func_param in inspect.signature(function).parameters.items():
        **annotation = func_param.annotation
        if get_origin(annotation) is Annotated:
            annotation, *metadata = get_args(annotation)**

        if param_or_artifact := get_workflow_annotation(annotation):
            if isinstance(param_or_artifact, Parameter):
                mapped_kwargs[func_param_name] = get_annotated_param_value(func_param_name, param_or_artifact, kwargs)
            else:
                mapped_kwargs[func_param_name] = get_annotated_artifact_value(param_or_artifact)

        # InputV1 or InputV2 subclass ? 
        elif not is_subscripted(annotation) and issubclass(annotation, (InputV1, InputV2)):
            mapped_kwargs[func_param_name] = map_runner_input(annotation, kwargs)
        [...]

I haven't the right env to properly patch/check for break and test (not sure also if it's interoperable with previous version), but seems to be isolated to this function.

Hope this help tough the dev.
Br,
SM.

@alicederyn alicederyn self-assigned this Oct 16, 2024
@alicederyn alicederyn added type:bug A general bug and removed note:needs-info We need more info from the issue submitter labels Oct 16, 2024
@sam2x
Copy link
Author

sam2x commented Oct 16, 2024

Hey again,

I have been digging a bit in the source code. I may have make a mistake on the origin of the issue. This may also come from:
def _get_inputs_from_callable(source: Callable) -> Tuple[List[Parameter], List[Artifact]]:

I would appreciate your feedback at this discussion, those topics seems to be directly link to the same function :
#1241 (comment)

Let me know how I can help also.
Br,
Sam

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug A general bug
Projects
None yet
Development

No branches or pull requests

3 participants