Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for encoding/decoding external structs #200

Open
davydog187 opened this issue Apr 9, 2019 · 32 comments
Open

Support for encoding/decoding external structs #200

davydog187 opened this issue Apr 9, 2019 · 32 comments
Assignees

Comments

@davydog187
Copy link
Member

First of all, thank you for all the hard work on rustler!

We have been making extensive use of Rustler as a means to run simulations and cpu bound computations from Elixir.

We keep running into the problem of wanting to write Rust only libraries, then import them from an outer crate that exposes the library to Elixir. The issue is that Rustler only supports compile time encoding/decoding, which means that we either need to write an adapter and duplicate our data structures, or find some other means.

It would be wonderful if Rustler provided a mechanism to Encode/Decode external data structures so that Rustler doesn't leak into our lower level dependencies.

We are very interested in writing this feature for Rustler, but I first wanted to see if there has been any thought on this front in the past.

Thank you!

@evnu
Copy link
Member

evnu commented Apr 9, 2019

I spent some time thinking about this as well, but did not yet arrive at a good solution yet. We use feature flags to work around that (disable compiling proc_macro-generated code if the crate is not to be used within a NIF), or by copying struct definitions. I wonder if serde support (#74) would help here. If rustler was able to encode/decode from serde's interface, annotating a struct in a rust-only crate with #[derive(Serialize, Deserialize)] could possibly already work.

EDIT: serde-transcode could be useful to avoid serializing into an intermediate format here.

@scrogson
Copy link
Member

scrogson commented Apr 9, 2019

@davydog187, @evnu can one of you provide a little bit more detail into what your expectations are for this?

Is this so that you don't have to wrap external types in a new type?

@evnu
Copy link
Member

evnu commented Apr 9, 2019

Is this so that you don't have to wrap external types in a new type?

Not exactly. I sometimes encounter the following scenario. Imagine a compute crate, which exposes a struct A. A should be usable by a command line application, and it should be exposable to Elixir. In pseudo-code:

// crate compute
pub mod compute {
    #[derive(Debug)]
    pub struct A {
        a: i32,
    }

    impl A {
        pub fn new(a: i32) -> Self {
            Self { a: a }
        }
    }
}

// crate compute_nif
pub mod compute_nif {
    pub fn compute_an_a<'a>(env: Env<'a>) -> NifResult {
        let a = compute::A::new(10);

        a.encode(env);
    }
}

// crate compute_cli
pub mod compute_cli {
    fn main() {
        let a = compute::A::new(10);

        println!("a: {}", a);
    }
}

So, it would be nice if compute_nif could make use of A to expose it with as little extra effort as possible. Wrapping it in a newtype would be no problem, of course. I believe though that this conflicts with Rust's orphan rules (see the hoops one has to jump through with serde when implementing derive for a remote type).

@davydog187 I hope that example also covers what you are asking for, I did not intend to hijack this issue.

EDIT: I forgot to mention: If the remote type is wrapped in a newtype, the remote type still needs to implement Encode/Decode.

@davydog187
Copy link
Member Author

Let's image than I have crate_elixir and crate_rust. crate_rust doesn't want to have rustler as a dependency, but it has some data structure that we may way to ultimately use in crate_elixir

/// crate_rust
pub struct A {
  val: String
}

Now, if I want to use A in crate_elixir, I would have to re-encode A to a new type that has #[derive(NifStruct)], which potentially adds a lot of overhead. It would be awesome if there were a runtime way of encoding/decoding structs. Or even some trait that could be implemented

@davydog187
Copy link
Member Author

@evnu I think we're on the exact same page here 😆

@davydog187
Copy link
Member Author

This also brings up another issue, as far as I can tell, tuple structs and newtypes are not encodeable/decodeable

@scrogson
Copy link
Member

scrogson commented Apr 9, 2019

I see. Yeah, rust's orphan rules make this pretty interesting. I'm not sure exactly how to go about supporting this. But if serde was able to get around it, maybe we can as well...

@hansihe
Copy link
Member

hansihe commented Apr 9, 2019

Ah, I was not aware serde was doing it that way. Don't really like it, but it solves the problem at least.

So as far as I can see there are two alternatives here:

  1. Support encoding and decoding with serdes traits
  2. Implement the same mechanism as serde does for remote structs
  3. Both?

Any opinions?

@scrogson
Copy link
Member

scrogson commented Apr 9, 2019

It seems like if we supported encoding/decoding with serde's traits, we would still have the issue of not all types deriving Deserialize and Serialize.

Does that sound right?

If so, we would probably need to support both?

@hansihe
Copy link
Member

hansihe commented Apr 9, 2019

If we supported the serde traits, it would enable the user to use serde's mechanism for remote structs.

But again, that's not completely optimal either. We should probably support something similar to what they are doing in our macros.

@evnu
Copy link
Member

evnu commented Apr 10, 2019

But again, that's not completely optimal either. We should probably support something similar to what they are doing in our macros.

Do have something in mind for that? As far as I understand, it is not possible to retrieve the AST for a remote type. Then, the remote struct definition must be explicitly copied which again results in code duplication.

@hansihe
Copy link
Member

hansihe commented Apr 10, 2019

@evnu I would love to avoid having to do that, but as far as I understand there really isn't any alternative? I suspect serde would have used a better solution if it existed

@elbow-jason
Copy link
Contributor

elbow-jason commented Apr 17, 2019

Is the idea here to take a Rust struct and serialize it into erlang term format. Then use :erlang.binary_to_term/1 to decode the Erlang runtime struct? or to define a struct in Rust and have the Rust struct's definition end up in Elixir? or both? or neither?

@hansihe
Copy link
Member

hansihe commented Apr 17, 2019

@elbow-jason The idea was to encode/decode the serde traits directly from the in memory term representation and back, much like we do currently with our own encoder/decoder traits.

This should hopefully have less of an overhead than going through the binary format.

@sunny-g
Copy link
Contributor

sunny-g commented Jun 9, 2019

@hansihe wrote a library pertaining to the issue: serde_rustler provides a Serializer and Deserializer for directly encoding/decoding Rust types to/from Elixir terms.

Encoding and decoding benchmarks look promising (like, insanely, unbelievably promising), and given that I haven't published many Rust crates or Elixir packages, I'd greatly appreciate feedback on reconfiguring the benchmarks, improving the tests, changing the serialization/deserialization behavior, or just adding to/improving the API.

Hope y'all enjoy!

@hansihe
Copy link
Member

hansihe commented Jun 9, 2019

@sunny-g Great work!

Seeing as this would probably be useful for quite a few people I think it would make sense to at last add it to the Readme of rustler?

I looked through it briefly, and it looked pretty good from what I could tell. I'll hopefully find some time to look through it more thoroughly soon.

@sunny-g
Copy link
Contributor

sunny-g commented Jun 9, 2019

@hansihe Awesome, I'll open a PR. I'm kinda using the tests package as a playpen for testing, benchmarking and more advanced rustler usage, so if you see something amiss, please let me know!

@davydog187
Copy link
Member Author

@sunny-g I stumbled across your repo the other day. Great work! Should this be folded into rustler or should it stay as a separate library?

My worry is that the fragmentation makes it less accessible

@sunny-g
Copy link
Contributor

sunny-g commented Jun 9, 2019

PR to add link to readme here: #224.

@davydog187 I'm not opposed to it, but I'm still in the process of debugging some performance issues and experimenting with additional features (e.g. a Value type similar to serde_json::Value for operating on raw Elixir terms, and a dedicated library for serialization between Elixir and existing serde formats). I'd reconsider once some of these things have stabilized/are finalized.

@evnu
Copy link
Member

evnu commented Oct 18, 2019

@sunny-g were you able to make progress?

@scrogson
Copy link
Member

Just to chime in here. I’d love to see serde_rustler pulled into the rustler repo/workspace. I used it recently on https://github.com/scrogson/no-way-jose and I was thoroughly impressed.

@sunny-g excellent work!

@sunny-g
Copy link
Contributor

sunny-g commented Oct 21, 2019

@evnu never got around to it 😦

@scrogson Appreciate that! I've been coming around to this idea as well - To start, I think it could be done with as little as an impl Serializer for Env and impl Deserializer for Term tucked behind a serde feature flag, and might also be worth augmenting/replacing the existing NifStruct-type attribute/macros with ones that provide shorthands for module struct/record naming and derive Serialize/Deserialize - what do you think?

@sunny-g
Copy link
Contributor

sunny-g commented Mar 5, 2020

@davydog187 @hansihe I'm not really using Elixir much anymore (my foray into serde_rustler got me hooked on Rust 🤷‍♂️) and the crate is beginning to accumulate issues and PRs so I'm now very willing to deprecate my library and fold it's logic into rustler.

If someone wants to take this task, I think the easiest and most simplifying (though breaking) changes would be what I mentioned in my previous comment here, i.e. impl Serializer for Env and impl Deserializer for Term, re-using logic from serde_rustler, and optionally:

  1. getting rid of Encoder and Decoder while merging their logic into the Serializer and Deserializer and
  2. deprecating the NifStruct et al macros, instead pushing users to use serde's container, field and attribute macros.

If / however y'all want to do this is fine with me, and I can help anyone who wants to pick up the task.

@scrogson
Copy link
Member

scrogson commented Mar 9, 2020

I started integrating this into rustler over the weekend.

@scrogson scrogson self-assigned this Mar 9, 2020
@joshuataylor
Copy link

Howdy! I've ended up updating serde_rustler using my branch to point to the latest versions of Rustler. Are there still plans to roll this in here (as a feature flag, I'm guessing)?

I'm writing a library using rustler for req_snowflake, and Snowflake supports two different response types: JSON & Arrow. I'm using arrow2 Rust library for parsing these Arrow IPC Streams (they're files Snowflkae sends back as gzipped), and I'm honestly blown away by how fast the NIF is. Initial benchmarks show promising results, and we can even create Elixir types from Rust as we know the types from Arrow.

I've got little experience with Rust, just having used it to create a few small CLI apps here and there, so would love feedback on when the library is ready.

@aj-foster
Copy link
Contributor

Hi @scrogson and friends 👋🏼 I was recently in contact with @sunny-g and we briefly discussed the future of serde_rustler. Wanted to check in on the possibility of integrating it completely (if that's still on the table) or otherwise providing support to update the library.

The future is in your hands: if you still believe that integration is the best path forward, please let me know how we can help. If another path is better, let's see what we can do. 👍🏼

@evnu
Copy link
Member

evnu commented Oct 7, 2022

@aj-foster Integrating serde_rustler would be nice! Would you like to prepare a pull request for vendoring it in? Would that be ok with @sunny-g ?

@sunny-g
Copy link
Contributor

sunny-g commented Oct 7, 2022

@evnu @aj-foster works for me! Depending on the usage of the existing macros, I would still recommend what I mentioned here; otherwise @aj-foster and I can spearhead a simple PR to just merge serde_rustler into a rustler subdirectory, behind a feature flag, etc.

@aj-foster
Copy link
Contributor

Hi @evnu, I'm happy to prepare a pull request. I'm curious how you would like to approach this from an ergonomic perspective. We can take a relatively light approach (adding the package here, still segmented from the main code), or a more integrated approach like Sunny mentioned above.

Please feel free to spell out what you're imagining like I'm a small child — I'll need guidance throughout this process. 🙂

@evnu
Copy link
Member

evnu commented Oct 8, 2022

@aj-foster good question on the steps. Lets keep it simple for now and start with vendoring serde_rustler into the repository. That also means checking the links in serde_rustler to ensure that they will point to the correct documentation.

Afterwards, we will need to discuss how to integrate serde_rustler further. I am not sure if we want to get rid of Encoder/Decoder completely, as that would mean that we need to basically merge serde_rustler into rustler (IIUC), requiring users to dependent on serde even if they don't actually want to use serde. But maybe we can delay that decision for now, and tackle rustler_codegen as the second step: Can we replace that completely with serde_rustler (especially the new NifEnum decorator), and is it as efficient (zero-cost conversion) as rustler_codegen? If so, lets see how we can deprecate rustler_codegen.

So, I propose these steps:

  1. Vendor serde_rustler.
  2. Determine if we can replace all functionality of rustler_codegen. If so, deprecate rustler_codegen.
  3. Determine if we need to integrate parts of serde_rustler deeper into the rustler crate itself.

Does that make sense? CC @rusterlium/core

@scrogson
Copy link
Member

@evnu rustler_codegen is still required as it has all the macros for the init, nif, etc. macros.

@evnu
Copy link
Member

evnu commented Oct 12, 2022

@scrogson of course, I forgot about that part, thanks for pointing that out. So instead we would need to check if we can deprecate NifStruct etc.

@evnu evnu mentioned this issue Nov 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants