Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add binary payload handling to PyLECO #82

Merged
merged 22 commits into from
Jun 19, 2024
Merged

Add binary payload handling to PyLECO #82

merged 22 commits into from
Jun 19, 2024

Conversation

BenediktBurger
Copy link
Member

@BenediktBurger BenediktBurger commented May 22, 2024

Implements the transport of binary data in additional frames, related to How to transmit binary data (pymeasure/leco-protocol#65).

Idea:

  • First payload frame is JSON
  • Second (and more) frame are binary

Implementation ideas.

  • MessageHandler offers the whole message for the rpc handler and adds the additional frames afterwards
  • Communicator can return the bytes
  • Communicator (and Director) can send bytes in a rpc request

Open questions:

  • Should a message with binary data contain a different message type?
  • If not, how to know, whether the binary data is the desired response instead of the json response?
    A return value of None with additional payload frames is a strong indicator, but not exclusive to this scenario. Is it sufficient?
  • The additional_payload of the Message: what to do, if there is no data defined: should there be an empty first data frame before the additional, or is the additional_payload the whole payload?

@BenediktBurger BenediktBurger added the enhancement New feature or request label May 22, 2024
Copy link

codecov bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 98.79518% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.52%. Comparing base (cd0bf05) to head (6ba4c19).
Report is 1 commits behind head on main.

Files Patch % Lines
pyleco/utils/listener.py 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #82      +/-   ##
==========================================
+ Coverage   89.04%   89.52%   +0.48%     
==========================================
  Files          36       36              
  Lines        2902     2931      +29     
  Branches      355      361       +6     
==========================================
+ Hits         2584     2624      +40     
+ Misses        267      256      -11     
  Partials       51       51              
Flag Coverage Δ
unittests 89.52% <98.79%> (+0.48%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@BenediktBurger
Copy link
Member Author

@seb5g, in this PR I experiment with the transport of binary data via PyLECO.
As it is a feature you preferred over the base64 encoding, I want to make you aware.

@BenediktBurger
Copy link
Member Author

Idea for:

If not, how to know, whether the binary data is the desired response instead of the json response?
A return value of None with additional payload frames is a strong indicator, but not exclusive to this scenario. Is it sufficient?

Just add a parameter, whether to return a binary value.

@BenediktBurger
Copy link
Member Author

@untzag, @ksunden, @VigneshVSV this could be also of interest for your own projects.

@BenediktBurger
Copy link
Member Author

I ask for feedback regarding the current implementation:

This is how you call the method some_funny_method with additional binary payload (for the method):

# on the remote side (on MessageHandler):
def some_funny_method(self):
    additional_payload = self.current_message.payload[1:]
    # do something with additional_payload
    self.additional_response_payload = [b"whatever", b"to return", b"binary"]
    return None

# on the requester side:
ask("receiver", method="some_funny_method", additional_payload=[b"123", b"456], extract_additional_payload=True) 
# returns: [b"whatever", b"to return", b"binary"]

The parameter extract_additional_payload (independent of the additional payload on the way to the remote contact) will return the binary return value of some_funny_method, if the JSON response value is None.

@VigneshVSV
Copy link

VigneshVSV commented May 28, 2024

I think its a tricky question in general as seen by some questions you raised in the issue.

Currently I allow either returning either of non-serialized data or serialized data only. Not both. Technically a user could do

def my_server_side_method_returning_bytes(self):
    return memoryview(self.my_large_numpy_array)

def my_server_side_method_normal(self):
    return 5, 'my name', self.foo_list

i.e. for the user, the usage of methods is the same and there is no need to access an additional request or response object. Although I dont see what is wrong with that either. Some HTTP servers which work based on decorating functions, like flask, allow to access the request object through some global variable. Same can be then done for response.

Nevertheless, since I allow only one type (either of bytes, bytearray or memoryview) or of a other python types which needs to be serialized, its a little easier for me to place them in different indices of the multipart message to know what was the reply. In this way, the client just checks which part of the multipart message containing return values has size greater than 0. The user given pre serialized part or the RPC server serialized part?

Of course, the ideal and best case is to have both so that the user can return any damn return value. This is the right approach. May be its best to reserve a tuple of size 2 as a fixed internal return value scheme. Just before serializing, the type can be checked to be tuple and if its of length 2, then one can proceed to inspect the types each element of the element. If one turns out to be byte type, then dont serialize it again as that will lead to serilization error:

def my_mixed_returning_server_method(self):
    return { 1: self.some_foo_list, 3 : 42 }, memoryview(self.my_large_numpy_array)

Within the RPC server:

    ret = func(*args, **kwargs)
    if isinstance(ret, tuple) and len(ret) == 2: # return value containing one normal part and one byte part
          if isinstance(ret[1], (bytes, bytearray, memoryview)): # check if its really a byte part
                 self.serializer.dumps(ret[0]) # ret[1] is already serialitzed
          else:
                self.serializer.dumps(ret)  # serialize normal data given in tuple of length 2 
          # as the return value was indeed a normal tuple
   elif not isinstance(ret, (bytes, bytearray, memoryview)): # covers the case of prevent serialization of single 
        # return value which is of bytes type
       self.serializer.dumps(ret) # some other data including tuples of lower or higher length

Of course the dumps return value must be assign to a place within the multipart message.

This is the way I would do it.

@BenediktBurger
Copy link
Member Author

Thanks @VigneshVSV for your extensive answer.
Part of the challenge is the currently used rpc server. The server sees only the json request and returns a json response. I have to handle the binary things outside (with these external variables).

@BenediktBurger
Copy link
Member Author

I have an idea (inspired by you): If the json frame is empty, the other frames are the return value.

@BenediktBurger
Copy link
Member Author

Newest design:

  • in ask, the extract_additional_payload parameter decides on the return value: Either the JSON return value or a tuple of the JSON return value and the additional payload, which might be an empty list. Therefore it is always clearly defined, what it returns
  • The message handler offers a possibility to register a binary method:
    • If a parameter is set, the method will receive the request's additional payload as a kwarg.
    • It's return values will be filtered for binary data, which will be transmitted as additional payload.

@VigneshVSV
Copy link

Regarding the ask RPC caller, I would still keep in mind that the server side method works transparently irrespective of whether it is called within another function or method, or from client side with the ask. I am not sure if the change you made affects this.

@BenediktBurger
Copy link
Member Author

@VigneshVSV , thanks for your comments, they are helpful.
Registering a method makes it available via RPC. Therefore, you have to regsiter it, before you can call it via ask.

@BenediktBurger BenediktBurger marked this pull request as ready for review June 1, 2024 10:59
@BenediktBurger
Copy link
Member Author

A small change:

  • You have to choose, whether the method will return only JSON values or a tuple of a JSON value and a list of bytes.

Currently the you have to tell the ask method, whether to expect binary data or not. I do not consider this as a major problem:

  • You know which method you called (and binary methods have a "(binary)" at the end of their docstring), so you know, what to expect.
  • In doubt, you can ask for the binary_code and mostly use the first (JSON) response value. Here it comes handy, that you know the return value (JSON or tuple of JSON and list) depending on the parameter.
  • Later on, we still can add a special MessageType, which indicates a bytes payload.

@seb5g
Copy link

seb5g commented Jun 4, 2024

I'm not using really pyleco at the moment as you are the one who implemented it in pymodaq... so I still have difficulties to really understand how all this works. But for sure, everything within pymodaq can be binary serialized so direct emission of such data (frame as you call it?) would be better. I'm not sure how i could really help in the design...

@BenediktBurger
Copy link
Member Author

Thanks for your comment @seb5g. I appreciate it.

@BenediktBurger
Copy link
Member Author

Summary:

  • It is easier to add more payload frames (list of bytes) with the additional_payload parameter of Message, DataMessage classes, and ask_rpc methods
  • interpret_rpc_result can either return just the content of the first payload frame (normally json) or that content plus the list of additional frames (maybe an empty list), depending on a parameter switch. Also ask_rpc allows to use that parameter. That allows to retrieve the additional frames easily
  • Finally, the MessageHandler offers to register binary methods which can accept and/or return also binary values

@BenediktBurger BenediktBurger merged commit ca94ef5 into main Jun 19, 2024
20 checks passed
@BenediktBurger BenediktBurger deleted the binary_payload branch June 19, 2024 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants