Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make alloc optional, defaulting to slice based API #29

Merged
merged 21 commits into from
Nov 30, 2020

Conversation

MathiasKoch
Copy link
Contributor

@MathiasKoch MathiasKoch commented May 6, 2020

Currently all Bytes/BytesMut are eliminated, but some simple slice tests are still needed.

Also we need to figure out how we want to setup tests.. We could just test with --all-features, but i think that would never catch no_std and no_alloc bugs.. On the other hand, testing without default features, wouldn't allow prop-tests and Bytes/BytesMut, so we probably need to do tests in two steps, or something like that?

All tests pass in the current configuration.

After this, alloc::{Vec, String} has to be addressed.

Fixes #27

…es/BytesMut are eliminated, but some simple slice tests are still needed. After this, alloc::{Vec, String} has to be addressed
@00imvj00 00imvj00 requested a review from vincentdephily May 6, 2020 08:53
Copy link
Collaborator

@vincentdephily vincentdephily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far, glad the changes aren't too intrusive.

This needs updated (or additional) tests that decode into a stack-based buffer.

Copy link
Collaborator

@vincentdephily vincentdephily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fix the git comment ? Missing some key words, maybe shell expansion got you.

Copy link
Collaborator

@vincentdephily vincentdephily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this 2nd commit correctly, it expands just one unittest and fixes function signatures to take a mut buf everywhere ?

Were the signature changes required to make the unittest compile, or are they forward-planning for n upcoming change ?

Concerning the unittest, we should be much more systematic about it. I'd like to see most test switch to using &[u8] instead of BytesMut, and where behavior differ (when BytesMut reallocs, at least) two sets of tests.

@MathiasKoch
Copy link
Contributor Author

This needs updated (or additional) tests that decode into a stack-based buffer.

I will definitely add some tests and rewrite some of the existing, but this is still just a draft PR, in order to get some discussion on the implementation details going.

Can you fix the git comment ? Missing some key words, maybe shell expansion got you.

Ohh sure, i used backticks '`', which is apparently not allowed in git messages.. I will amend it.

If I understand this 2nd commit correctly, it expands just one unittest and fixes function signatures to take a mut buf everywhere ?

Were the signature changes required to make the unittest compile, or are they forward-planning for n upcoming change ?

The signature changes are not super necessary, but they are changed because it is the correct way of using an impl trait signature.. The &mut impl trait doesn't really make sense, and i think the only reason it works with Bytes/BytesMut, is because of their internal Derefimplementations. With almost all other types, it makes for a very awkward combination of ampersands and mut keywords. Also it doesn't enforce that the variable passed to the interface is marked as mut.

We still need to address how to rid the current alloc::String and Vec, without making the changes too intrusive..

@vincentdephily
Copy link
Collaborator

I will definitely add some tests and rewrite some of the existing, but this is still just a draft PR, in order to get some discussion on the implementation details going.

No worries. I just like to see the tests early for a change like this, helps me see that the API is still sound.

Ohh sure, i used backticks '`', which is apparently not allowed in git messages.. I will amend it.

Backticks are allowed and useful, they might just have been interpreted by your shell before sending the message to git. Either escape them, or use a text editor for your commit message.

The signature changes are not super necessary, but they are changed because it is the correct way of using an impl trait signature.. The &mut impl trait doesn't really make sense, and i think the only reason it works with Bytes/BytesMut, is because of their internal Derefimplementations. With almost all other types, it makes for a very awkward combination of ampersands and mut keywords. Also it doesn't enforce that the variable passed to the interface is marked as mut.

Ok, makes sense, thanks. Let's see how it pans out with the other changes done.

We still need to address how to rid the current alloc::String and Vec, without making the changes too intrusive..

In theory the Strings and Vec<u8>s should get converted to slices. This'll make lifetimes more cumbersome, but if decode() takes the buf by ref and returns a consumed usize (or moves the buf and returns two buf slices), it should work.

Note that I really don't want to break compatibility with tokio's tokio_util::codec::Framed API. At the moment we can simply

struct MqttCodec;
impl Decoder for MqttCodec {
    type Item = Packet;
    type Error = mqttrs::Error;
    fn decode(&mut self, src: &mut BytesMut) -> Result<Option<Self::Item>, Self::Error> {
        mqttrs::decode(src)
    }
}
impl Encoder<Packet> for MqttCodec {
    type Error = mqttrs::Error;
    fn encode(&mut self, item: Packet, dst: &mut BytesMut) -> Result<(), Self::Error> {
        mqttrs::encode(&item, dst)
    }
}

It's ok if the noalloc changes make this a little bit more complicated (we might even provide this boilerplate behind a feature flag), but it shouldn't make it less efficient.

There are also a handful of Vec<StringOrStruct> in the returned packets (for example {Un,}Subscribe.topics, and AFAIU MQTT5 will have a lot more of those) and I'm really not sure how to handle those. One option would be using something like a SmallVec that isn't allowed to spill to the heap (and therefore has a max capacity and may cause an error during decoding in noalloc mode), but it's not super satisfying (having to guess at the spill threshold, adding a dependency, adding a failure mode...).

… &mut [u8] and BytesMut, and add additional tests to check WriteZero when buffer is too small, and unable to allocate
@MathiasKoch
Copy link
Contributor Author

  • Fixed the backticks in the git message
  • Rewrote all the decoder tests to use &[u8], as these will never allocate anyways
  • Rewrote the encoder tests to test on both &mut [u8] and BytesMut
  • Added a test to verify the behaviour of Error::WriteZero in case the input buffer is too small in encode, and also verifies that alloc types (BytesMut), allocates and returns correctly.

I have not made any changes to the codec_tests, as i feel a bit unsure on the prop_test stuff..
Btw, is there a reason why decoder.rs contains tests as well, rather than just keeping them in decoder_test.rs?

Note that I really don't want to break compatibility with tokio's tokio_util::codec::Framed API. At the moment we can simply

Cool, i don't see how that becomes a problem.. I can add the trait impl behind a feature flag, and add a unit test to make sure we don't break this, if you would like?

I think we should be able to manage the input part of Vec and String with slices and lifetimes, but the returns are a bit harder..? Also that would mean different types for input and returns, which i think would be a bit sad..
I think this part is the harder part of allowing no_alloc, so any suggestions on ridding string and vec is greatly appreciated!

@MathiasKoch
Copy link
Contributor Author

MathiasKoch commented Jun 5, 2020

I am still battling a bit with how to eliminate Vec & String as part of this feature, on the side of all my other work.

I do really want to see this happen though, as we are basing our entire product-line with this as a core crate, and we see it as a strict requirement to eliminate run-time allocators, as 100% uptime and predictable memory outcomes is important for us.

Do you have any idea on how to combat these? Right now, we are considering forking internally to rid the requirement of not breaking the API, and thus allowing us to replace it all with something like heapless. But i would much rather if we could figure out a way to get this upstream into this crate!

I can add something like heapless with generic lengths and a sensible default value on each of them, eg.

/// Connect packet ([MQTT 3.1]).
///
/// [MQTT 3.1]: http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718028
#[derive(Debug, Clone, PartialEq)]
pub struct Connect<
    C = consts::U128,
    U = consts::U128,
    P = consts::U128,
    T = consts::U128,
    M = consts::U128,
> where
    C: ArrayLength<u8>,
    U: ArrayLength<u8>,
    P: ArrayLength<u8>,
    T: ArrayLength<u8>,
    M: ArrayLength<u8>,
{
    pub protocol: Protocol,
    pub keep_alive: u16,
    pub client_id: String<C>,
    pub clean_session: bool,
    pub last_will: Option<LastWill<T, M>>,
    pub username: Option<String<U>>,
    pub password: Option<Vec<u8, P>>,
}

With more sensible defaults of course. I don't think its pretty, but it allows using Connect without explicitly specifying the generics, while still allowing to overwrite them in case a user is not happy with the initial max-length limitations.

EDIT:

The default values might not be needed??

EDIT 2:
After testing it out, it turns out the user-side of this approach is terrible..
It results in

) -> Result<Packet, StateError> {
            ^^^^^^ expected 8 type arguments

Everywhere! I probably should have foreseen that.. I am running a bit low on ideas for a decent user-facing API

@00imvj00
Copy link
Owner

00imvj00 commented Jun 5, 2020

How about just one allocation per packet?? I tried hard, but I think we can achieve that with one allocation easily.

The idea is, store the whole package as a stream of bytes inside Packet, then call methods to convert those bytes to appropriate fields.

This seems like a viable option to provide some consistency in allocation.

Another option is to receive vector from outside and then own it, but that also requires allocation outside.

encode(&mut buffer: u8[], internal: Vec<u8>)

what above function will do is, it will copy single packet length bytes from buffer into internal vector and then own it.

This was the closest i came to keeping it clean as well as minimal allocation design.

@MathiasKoch
Copy link
Contributor Author

I think that could work, if it would be possible to pass in the internal as a &mut [u8] instead, maybe something like

encode_slice(&mut buffer: u8[], internal: &mut [u8])
encode(&mut buffer: u8[]) {
    ...
    encode_slice(buffer, ..)
}

To allow alloc users to have automatic allocation?

another option to &mut [u8] would be impl BufMut?

@00imvj00
Copy link
Owner

00imvj00 commented Jun 5, 2020

Yes, this can also work. if you can experiment with this and post if this is viable option.

@MathiasKoch
Copy link
Contributor Author

MathiasKoch commented Jun 5, 2020

I can try, but i think

The idea is, store the whole package as a stream of bytes inside Packet, then call methods to convert those bytes to appropriate fields.

Is the part i will find a bit tricky atm.

Do you have a branch or anything like, where you played with this?

@00imvj00
Copy link
Owner

00imvj00 commented Jun 5, 2020

Not at the point.

So, I will tell you the idea.

Instead of parsing in advance, we will store the stream of bytes received from the socket as it is.

Once, someone call Packet.qos(), we can parse stream and get only Qos Part.

Sort of like lay parsing. This might also save us CPU as we are doing eager parsing.

@MathiasKoch
Copy link
Contributor Author

Ahh.. I think that makes great sense!
I will try to give it a shot during the coming week.

@jordens
Copy link

jordens commented Jun 5, 2020

There is managed (used heavily in smoltcp) to abstract over heap/stack ownership. But I don't know whether it helps here.

@MathiasKoch
Copy link
Contributor Author

After having a small look at this, i am not convinced it is something i can do at the moment. This would be a COMPLETE rewrite of the crate.. It might even be faster to start from scratch...

Once, someone call Packet.qos(), we can parse stream and get only Qos Part.

This would mean that all the Packets (Connect, Subscribe, etc.) cannot be structs holding data, as there are nothing but an internal byte buffer, until one calls a method on the packet..

So basically neither the implementations of the crate, nor the definitions of structs would be reusable?

@00imvj00
Copy link
Owner

00imvj00 commented Jun 8, 2020

@MathiasKoch yes. this might require lots of changes. You can experiment, if all goes good we can version bump.

But, are you positive about the idea?? if yes, we can experiment atleast and see how it pans out.

@MathiasKoch
Copy link
Contributor Author

It just turned out to be a larger change that i anticipated.

That said i DO think it could work out, and also with a lot of benefits over the current approach, both in regards to alloc-free, but also the lay parsing, and one step closer to zero-copy.

I think the first problem would be to work out what the API from a user facing side would look like..

I think the entire encode part would be okay as-is? assuming we change the structs it uses, to be slice based instead, for topic names, client id etc., as that would mean the full encode half of the crate would be alloc-free, and actually also closer to zero-copy..

So it's the decode half that needs to rewritten somehow. The sad part is that the decode half wouldn't be able to use the same structs as the encode, but i guess that can be somewhat fixed by implementing From across everything, and using a lot of Into<> trait bounds?

@00imvj00
Copy link
Owner

00imvj00 commented Jun 8, 2020

Yes, even I was thinking about From and Into.

If not, we can have to decode something like.

let mut packet = Packet::new();
let buffer = [u8;];
fn decode(&mut buffer, &mut packet);

here, we are creating a packet outside only, which will perform the required single allocation.

After that, we will copy the required buffer slice from the buffer into the packet.

This is also if we can somehow use Into.

This can potentially be allocation free.

@MathiasKoch
Copy link
Contributor Author

I fixed the Suback::new(), Unsubscribe::new(), Subscribe::new() functions, by introducing heapless as a new dependency. It is not perfect, but i couldn't find any other way of doing it, as it requires at least one allocation no matter how.

I chose to set a limit (currently 5), on how many topics it is possible to subscribe and unsubscribe to in a single packet, if std is not available. I think this is very reasonable, given that it is only applicable in no_std contexts.

@MathiasKoch MathiasKoch marked this pull request as ready for review June 12, 2020 07:50
@MathiasKoch
Copy link
Contributor Author

MathiasKoch commented Jun 16, 2020

Found one last bump on the road: https://github.com/tokio-rs/bytes/blob/a3304e8b8b74e2d5260d8419b6d8075e004e4c39/src/lib.rs#L84

Any suggestions? Or any reasons to stick with the bytes dependency, as it is purely used for the BufMut trait now? Would make sense to find a similar dependency more suited, or an alternative all together? We could rewrite the encode part to the same buf: &'a mut [u8], offset: &mut usize format, with a helper function taking a BufMut in case alloc is available? That would also give a fully consistent API wrt. encode/decode being the same.

EDIT:
I did an experiment on what above encode API could look like here: https://github.com/MathiasKoch/mqttrs/tree/wip/alloc-free-encode

It is working as expected, and i think it is a somewhat nice API.. It might even make sense later on to take it one step further, and take in anything that implements Into to a mutable iterator over u8 instead, just calling .next() everywhere offset is incremented now.. That would allow for streaming data, and would still not break any of the current usecases..

Unfortunately that wont be possible on the decoding in the current form, due to the shared memory references.. I'm not quite sure how that would be handled, if the iterator/streaming way is on the wishlist eventually.

BUT: what do would think about the current buf, offsetimplementation? I think it still needs some testing with Bytes/BytesMut, to verify it works..

… removal of Bytes as dependency, in case of no_std where an allocator is not available, as Bytes requires an allocator
@00imvj00
Copy link
Owner

00imvj00 commented Jul 1, 2020

@MathiasKoch any updates on changes requestd?

@MathiasKoch
Copy link
Contributor Author

@00imvj00 Changes requested?

You mean the ones @vincentdephily made 14 commits ago? They were addressed in the first of said 14 commits.. But if you read my comments, there is some stuff up for discussion.

@MathiasKoch MathiasKoch changed the title First swing at putting alloc behind a feature flag. Make alloc optional, defaulting to slice based API Nov 15, 2020
@MathiasKoch
Copy link
Contributor Author

What can we do to bring this forward? I have been running it in production for quite some time now, with no issues at all, and i think especially the zero-copy properties of this PR is really nice!

@00imvj00 00imvj00 merged commit 159c8e6 into 00imvj00:master Nov 30, 2020
@00imvj00
Copy link
Owner

@MathiasKoch merged. thanks. !!

@00imvj00
Copy link
Owner

Should we bump the verison? @MathiasKoch

@MathiasKoch
Copy link
Contributor Author

@00imvj00 Yeah, i would do so, as it might bring unforeseen breaking stuff.. Better safe than sorry

@MathiasKoch
Copy link
Contributor Author

@00imvj00 Can you also release 0.4.0 to crates? :) Thanks man!

@00imvj00
Copy link
Owner

yeah, will release it in crates by tomorrow.

@00imvj00
Copy link
Owner

00imvj00 commented Dec 8, 2020

@MathiasKoch i think adding lifetimes in packet adds more complexity when wiriting async code in rust. I think we might have to come up with alternative, innovative solution.

The bottomline is, we want to only do single allocation per packet and we don't want to use lifetimes.

Can we achieve something like this?

I am leaning towards the idea of (offset, length) storage. i know it is bit not simple, but i think we can achieve the goals with that, i will try to come up with impl for that.

Apart from that, we might also want to get rid of BytesMut dependency.

We can try to have our own trait like MqttRead: Read, MqttWrite: Write, which users of the lib will have to implement, we will take care of all the moethods internally. Users will just have to write the declaration.

thoughts? @vincentdephily

@MathiasKoch
Copy link
Contributor Author

Ahh, yeah i did not think about async.

I would be sad to see something like MqttRead: Read, MqttWrite: Write though, as Read / Write is not available in #[no_std].

That said i think it should be possible, but would probably take a look towards some of the bigger libraries that support no_std, no_alloc and still supports async fully, for their implementation details?

@00imvj00
Copy link
Owner

00imvj00 commented Dec 8, 2020

Yeah. that would be great.

One of the thing we totally missed was async. Today I started working on demo programs and there was wired lifetime issue, and It stuck, it is killing the ergonomics of the whole thing and making it as wired as it was before.

I think we might have to push towards ease of use also.

If we can crack this right, we can easily apply this method to all sorts of wired protocols.

@MathiasKoch
Copy link
Contributor Author

Can you bring a snippet of async code where this poses an issue? Just to make sure we are actually addressing the same issue :)

@00imvj00
Copy link
Owner

@MathiasKoch sure i will. I think the issue is happening due to lifetime parameter in the Packet type. We need a way to get rid of that. I think i have figured out a way. not pretty. but will work. experimenting on that.

@stabler
Copy link

stabler commented Jan 21, 2021

The lifetime change is breaking my code! I can't pass Packet across threads anymore.

At minimum, can we add an owned version of the packet, and API to convert between the two (similar to Cow)? Or is there another way to manipulate owned Packet objects that I'm not seeing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support alloc-free operation
5 participants