Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Hibernating Websockets API #434

Closed
wants to merge 5 commits into from

Conversation

DylanRJohnston
Copy link
Contributor

@DylanRJohnston DylanRJohnston commented Jan 13, 2024

Hey all, this contains my attempt at writing the bindings for the Hibernating Web Sockets API. I haven't implemented the setWebSocketAutoResponse suite of functions yet as I don't have immediate use for them and I'd like feedback on the implementation so far.

It might also be worth extending the #[durable_objects] macro to support the new methods on the Durable Object required to make it work and get the wrapped typesworker::WebSocket instead of web_sys::WebSocket. However it's pretty easy to make it work directly with #[wasm_bindgen] for now and I've only written basic procedural macros in the past. Done.

@kflansburg
Copy link
Contributor

Thanks for this @DylanRJohnston, I think this looks good, but want @zebp to review the wasm bindgen stuff, and I will find someone to review the hibernation API itself.

@DylanRJohnston
Copy link
Contributor Author

DylanRJohnston commented Jan 14, 2024

So I've run into a bit of interesting usability problem. I've been following this example of how to use Hibernatable web sockets.

The function signature of webSocketMessage , (&self, incoming_socket: web_sys::WebSocket, message: String) only gives the WebSocket as a way of identifying "who" the message came from.

This is fine in JS, because objects can be map keys so you can use the WebSocket itself as a "key", and the worked example does exactly that, storing sessions in a map from WebSocket to session to data.

However in Rust, HashMap keys must implement std::hash::Hash, however since all js_sys::Object's deref into wasm_bindgen::JsValue and wasm_bindgen::JsValue is just a u32 "pointer" it is possible in theory to use any js_sys::Object as a map key. rustwasm/wasm-bindgen#1766 (comment) provides an interesting work around using std::mem::transmute. However there's a lot of discussion in the wasm_bindgen issues that they're not keen to expose and stabilize this fact so it could break in the future. The discussion at this point is quite old and it hasn't changed yet however.

// This is a hack until wasm_bindgen's API settles around `anyref`, see
// https://github.com/rustwasm/wasm-bindgen/issues/999
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct JsId {
    id: u32,
}

impl JsId {
    pub fn from_value(value: JsValue) -> JsId {
        unsafe {
            JsId {
                id: mem::transmute(value),
            }
        }
    }
}

So the question is, should this remain an esoterric work-around that people have to track down? Should they just settle for a linear scan of a Vec<worker::WebSocket> (since websys::WebSocket implements PartialEq) or would you be open to providing an implementation for Hash directly in worker-rs to improve the ergonomics of working with the hibernatable web sockets API?

@DylanRJohnston
Copy link
Contributor Author

Larger worked example for context

#[derive(Debug)]
pub struct User {
    pub metadata: Metadata,
    pub socket: worker::WebSocket,
}

#[durable_object]
pub struct Example {
    count: usize,
    state: State,
    env: Env,
    users: HashMap<JsId, User>,
}

#[durable_object]
impl DurableObject for Example {
    fn new(state: State, env: Env) -> Self {
        // Recover listeners from hibernation
        let recovered_sockets = state.get_web_sockets();

        console_log!(
            "Reloaded durable object, {} listeners recovered",
            recovered_sockets.len()
        );

        let mut users = HashMap::new();

        for listener in recovered_sockets.iter() {
            match listener.deserialize_attachment::<Metadata>() {
                Ok(Some(metadata)) => {
                    console_log!("Found metadata {:?}", metadata);

                    users.insert(
                        JsId::from_value(listener.as_ref().into()),
                        User {
                            metadata,
                            socket: listener.clone(),
                        },
                    );
                }
                Ok(None) => console_log!("No metadata found"),
                Err(err) => console_error!("Metadata failed to load: {}", err.to_string()),
            }
        }

        Self {
            count: 0,
            state,
            env,
            users,
        }
    }

@DylanRJohnston
Copy link
Contributor Author

DylanRJohnston commented Jan 14, 2024

Actually after more experimentation this doesn't work as expected, Object.is on which PartialEq is built doesn't seems to care about the idx of the objects, so I guess we're stuck with Linear scans of Vec, which will be fine for low numbers of sessions.

        for (socket, session) in self.sessions.iter() {
            if socket.as_ref() == &ws {
                console_error!(
                    "Found Match for {}: {:?} is equal to {:?}",
                    session.metadata.username,
                    JsId::from_value(socket.as_ref().into()),
                    JsId::from_value(socket.as_ref().into())
                );
            }
        }
Found Match for User 2: JsId { id: 133 } is equal to JsId { id: 150 }

Alternatively, you could use deserializeAttachment, but I think you'd have to get to a very high number of sessions before deserializing a key to look up in a map is faster than just a linear scan of a vector.

Maybe a future API of the Hibernatable Web Sockets could be an id() method that yields some opaque identifier for the WebSocket?

@DylanRJohnston
Copy link
Contributor Author

Just added support to the durable_object macro for the new methods required for the Hibernatable Web Sockets API.

@@ -32,6 +41,8 @@ pub fn expand_macro(tokens: TokenStream) -> syn::Result<TokenStream> {
"new" => {
let mut method = impl_method.clone();
method.sig.ident = Ident::new("_new", method.sig.ident.span());
method.vis = Visibility::Inherited;
Copy link
Contributor Author

@DylanRJohnston DylanRJohnston Jan 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent two hours trying to debug a problem caused by adding pub to any of the methods in the impl. It causes them to get also picked up by the #[wasm_bindgen] declaration which generates extremely obtuse errors about WSAM ABI traits not being satisfied. Given that the wrapped method is public and is exported to JS I think it makes sense to allow, if not enforce that these methods are pub.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not forcing the visibility to Inherited we should at least generate a helpful error messages that the visibility cannot be pub.

let static_self: &'static mut Self = unsafe {&mut *(self as *mut _)};

wasm_bindgen_futures::future_to_promise(async move {
static_self._on_message_raw(ws.into(), message).await.map(|_| wasm_bindgen::JsValue::NULL)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to your docs this method is supposed to return void so upon reflection I think maybe JsValue::UNDEFINED is the correct type here.

However the docs also say alert is also void, but you currently have it typed as worker::Response, so I'm not sure what it should be.

Should it match alert? Or should alert be changed to match the documentation and return void / JsValue::UNDEFINED?

Ok(quote! {
#wasm_bindgen_attr
impl #struct_name {
#(#tokenized)*
}

#pound[async_trait::async_trait(?Send)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_trait isn't required anymore, but if you wanted to still support older versions of Rust I can add it back in.

Comment on lines +201 to +205
pub fn serialize_attachment<T: Serialize>(&self, value: T) -> Result<()> {
self.socket
.serialize_attachment(serde_wasm_bindgen::to_value(&value)?)
.map_err(Error::from)
}
Copy link
Contributor Author

@DylanRJohnston DylanRJohnston Jan 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the Storage get / set API as prior art for this, let me know if there's a better approach.

If I understand this correctly, it converts it to a string before it gets sent to the JS serialize attachment method, meaning it gets "double serialised". The more "correct" apporach might be to have the trait bound to ensure it meets the "Structural Copy" requirement, but then just pass the struct to JS as is for serialization?

Not sure how deerialization would work though. So perhaps the double serialization is the best way to go.

Comment on lines +162 to +180
quote! {
#pound[wasm_bindgen::prelude::wasm_bindgen(js_name = webSocketClose)]
pub fn _on_close(&mut self, ws: worker_sys::web_sys::WebSocket, code: usize, reason: String, was_clean: bool) -> js_sys::Promise {
// SAFETY:
// On the surface, this is unsound because the Durable Object could be dropped
// while JavaScript still has possession of the future. However,
// we know something that Rust doesn't: that the Durable Object will never be destroyed
// while there is still a running promise inside of it, therefore we can let a reference
// to the durable object escape into a static-lifetime future.
let static_self: &'static mut Self = unsafe {&mut *(self as *mut _)};

wasm_bindgen_futures::future_to_promise(async move {
static_self._on_close_raw(ws.into(), code, reason, was_clean).await.map(|_| wasm_bindgen::JsValue::NULL)
.map_err(wasm_bindgen::JsValue::from)
})
}

#method
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
quote! {
#pound[wasm_bindgen::prelude::wasm_bindgen(js_name = webSocketClose)]
pub fn _on_close(&mut self, ws: worker_sys::web_sys::WebSocket, code: usize, reason: String, was_clean: bool) -> js_sys::Promise {
// SAFETY:
// On the surface, this is unsound because the Durable Object could be dropped
// while JavaScript still has possession of the future. However,
// we know something that Rust doesn't: that the Durable Object will never be destroyed
// while there is still a running promise inside of it, therefore we can let a reference
// to the durable object escape into a static-lifetime future.
let static_self: &'static mut Self = unsafe {&mut *(self as *mut _)};
wasm_bindgen_futures::future_to_promise(async move {
static_self._on_close_raw(ws.into(), code, reason, was_clean).await.map(|_| wasm_bindgen::JsValue::NULL)
.map_err(wasm_bindgen::JsValue::from)
})
}
#method
}
quote! {
#pound[wasm_bindgen::prelude::wasm_bindgen(js_name = webSocketMessage)]
pub async fn _on_message(&mut self, ws: worker_sys::web_sys::WebSocket, message: String) -> Result<wasm_bindgen::JsValue, worker::Error> {
self._on_message_raw(ws.into(), message).await.map(|_| wasm_bindgen::JsValue::NULL)
}
#method
}

According to the wasm_bindgen docs it can automatically convert async function to js_sys::Promise and can also automatically handle returns types of Result<T: Into<JsValue>, T: Into<JsValue>> so we can remove 90% of this boilerplate and the static_self "unsafe" rebind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further to this, you could implement describe::WasmDescribe, convert::IntoWasmAbi, and convert::FromWasmAbi by delegating to the inner implementations of the inner web_sys::WebSocket type like I did here for Enums which would then basically remove the need for this macro all together I think as you wouldn't need the boilerplate of converting back and forth from the raw web_sys types and the wrapped worker types.

Copy link
Contributor

@eric-seppanen eric-seppanen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple changes in this PR: upgrading dependencies, removing async_trait, the OptionalMethods changes, and adding the hibernatable API.

That makes it harder to read and understand the changes-- perhaps those should be distinct PRs?

@@ -15,4 +15,23 @@ extern "C" {

#[wasm_bindgen(method, js_name=waitUntil)]
pub fn wait_until(this: &DurableObjectState, promise: &js_sys::Promise);

#[wasm_bindgen(method, js_name=acceptWebSocket)]
pub fn accept_web_socket(this: &DurableObjectState, ws: web_sys::WebSocket);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be accept_websocket instead of accept_web_socket. The rest of the code refers to websocket, so adding the underscore seems mismatched.

pub fn accept_web_socket_with_tags(
this: &DurableObjectState,
ws: web_sys::WebSocket,
tags: Vec<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps &[&str] would be a bit friendlier here? That way the caller can choose to use literal strings, instead of needing to construct a throwaway Vec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of Vec instead of &[&str] was based on something I read in the wasm-bindgen docs about unboxed slices not being well supported. Although I can't seem to find that in the docs again so perhaps I'm misremembering. I'll try it out tomorrow and report back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I tried it in my branch and it seems to work fine.

Copy link
Contributor Author

@DylanRJohnston DylanRJohnston Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? I'm getting the following error the trait bound '&[&str]: wasm_bindgen::convert::IntoWasmAbi' is not satisfied

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everywhere else in the API where something similar is used, such as get_multiple in the DurableObjectTransaction uses Vec<JsValue>.

pub fn get_multiple(
        this: &DurableObjectTransaction,
        keys: Vec<JsValue>,
    ) -> Result<js_sys::Promise, JsValue>;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although Vec<String> is also giving me issues now, perhaps there was a change between wasm-bindgen 0.2.87 and 0.2.89. The problem only appeared after removing the changes I made to the wasm-bindgen version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @eric-seppanen can you double check that this is matches your experience?

wasm_bindgen Vec<JsValue> &[JsValue] Vec<String> &[&str]
v0.2.87 ✔️
v0.2.89 ✔️ ✔️
v0.2.90 ✔️ ✔️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely still have the public api's in worker accept &[&str] but they'll have to transform it to Vec<String> or Vec<JsValue> before calling the methods defined in worker-sys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry, I misspoke.

What I did was this:

#[wasm_bindgen(method, js_name=acceptWebSocket)]
pub fn accept_websocket(this: &DurableObjectState, ws: &web_sys::WebSocket, tags: &js_sys::Object);

and

pub fn accept_websocket(&self, websocket: &WebSocket, tags: &[&str]) {
    let js_tags = js_sys::Array::new();
    for tag in tags {
        js_tags.push(&js_sys::JsString::from(*tag));
    }
    self.inner.accept_websocket(websocket.as_ref(), &js_tags)
}

So the slice reference was only on the user-facing part.

#[wasm_bindgen(method, js_name=getWebSockets)]
pub fn get_web_sockets_with_tag(
this: &DurableObjectState,
tag: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I think &str would be better for caller flexibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since &str gets cloned between Rust and JS anyway I thought it was a more representative type, but upon reflection using String causes two allocations, one of the Rust side and one on the JS side. I'll update it.

self.inner.accept_web_socket(ws.as_ref().clone())
}

pub fn accept_web_socket_with_tags(&self, ws: WebSocket, tags: Vec<String>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here? Vec<String> -> &[&str]

.collect()
}

pub fn get_web_sockets_with_tag(&self, tag: String) -> Vec<WebSocket> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and String -> &str

@@ -109,24 +124,123 @@ pub fn expand_macro(tokens: TokenStream) -> syn::Result<TokenStream> {

#method
}
}
},
"on_message" => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to see this keep the same name that's used in the Cloudflare documentation. So I think this fn should be called websocket_message.


quote! {
#pound[wasm_bindgen::prelude::wasm_bindgen(js_name = webSocketMessage)]
pub fn _on_message(&mut self, ws: worker_sys::web_sys::WebSocket, message: String) -> js_sys::Promise {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binary websocket messages would deliver a message that's an ArrayBuffer, so I think this needs to handle both types.

This can be handled by having the incoming type be a JsValue, and then that can be converted into an enum depending on its JS type, e.g.

pub enum WebSocketIncomingMessage {
    String(String),
    Binary(Vec<u8>),
}
let ws_message = if let Some(s) = message.as_string() {
    worker::WebSocketIncomingMessage::String(s)
} else {
    let v = js_sys::Uint8Array::new(&message).to_vec();
    worker::WebSocketIncomingMessage::Binary(v)
};

I'm not certain whether the conversions above are infallible, but so far they seem to work for me. (I have a branch where I had been prototyping a similar change.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I wasn't sure what to make of the string | ArrayBuffer type in the docs but I didn't consider binary messages.

@DylanRJohnston
Copy link
Contributor Author

Hey Eric, appreciate the review, apologies for mixing multiple change into this Pull Request it was not intentional. I didn't realise I'd opened the pull request against main. I'll separate just the hibernatable web sockets stuff into its own branch and update the pull request to reflect that.

@MellowYarker
Copy link

Just a heads up, there should be another method added in the next couple of days that lets you retrieve tags (PR merged in Workerd, not sure how long it takes to reach workers-rs). I know this PR doesn't intend on covering the entire API but I figure I might as well let you know in case this is something you're interested in.

Ex (in JS)

let [client, server] = new WebSocketPair();
this.state.acceptWebSocket(server, ["one", "two"]);

let tags = this.state.getTags(server);
// tags == ["one", "two"]

You give it a WebSocket and it returns an Array of Strings, or throws if the websocket wasn't accepted as hibernatable.

@DylanRJohnston
Copy link
Contributor Author

DylanRJohnston commented Jan 17, 2024

@MellowYarker that sounds like a good way to handle the WebSocket -> Session map problem in Rust, you could attach a session id to the socket as a tag, and then retrieve it. Avoids needing the linear scan of the Vec.

Do you know what version of wrangler is required to support that API with wrangler dev?

@MellowYarker
Copy link

It's not in Wrangler quite yet. Looks like we haven't done a Workerd release in a while so I've asked when the next one will go out. I suspect it'll be out some time next week but I'll update back here if I learn anything new.

@DylanRJohnston
Copy link
Contributor Author

Apologies all, but I'm unable to change the branch of this pull request from the main branch of my fork to just the branch containing the Hibernatable Web Sockets changes. Closing this pull request in favour of #436.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants