Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call toJSON() on non-ext objects when it exists #188

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kevincennis
Copy link

@kevincennis kevincennis commented Aug 21, 2021

Currently, @msgpack/msgpack does not automatically call .toJSON() on objects which define that method.

Since JSON and msgpack have significant overlap in semantics and intent, and toJSON() is a signal from an object that says "this is how I want to be serialized", I believe it makes sense to use toJSON() when it exists (after exhausting possible extensions).

The concrete use-case for me was mongodb.ObjectId – which returns a simple string from toJSON(), but otherwise looks like:

{
  _bsontype: string,
  id: Uint8Array
}

Effectively, this change helps msgpack more closely match the behavior of JSON.stringify().


It wasn't entirely clear to me where these tests should go. There's not much in the suite for testing the encode side. Most of the tests seem to sort of implicitly test encoding while calling themselves decode tests, so I've stuck with that convention here.

@kevincennis
Copy link
Author

kevincennis commented Aug 21, 2021

As prior art, I'll submit this commit in msgpack/msgpack-node. I'd use that instead, but it doesn't appear to be actively maintained and it won't build on alpine linux, which I don't totally feel like digging in to.

@codecov-commenter
Copy link

codecov-commenter commented Aug 23, 2021

Codecov Report

Merging #188 (1b6f0b6) into main (9770004) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 1b6f0b6 differs from pull request most recent head b04a351. Consider uploading reports for the commit b04a351 to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##             main     #188   +/-   ##
=======================================
  Coverage   98.13%   98.14%           
=======================================
  Files          16       16           
  Lines         964      968    +4     
  Branches      206      208    +2     
=======================================
+ Hits          946      950    +4     
  Misses         18       18           
Impacted Files Coverage Δ
src/Encoder.ts 98.01% <100.00%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9770004...b04a351. Read the comment docs.

@gfx
Copy link
Member

gfx commented Aug 23, 2021

Thank you for your effort, but I'm not sure it's useful for everyone, because decode() cannot know how to retrieve the encoded data. Instead, I recommend using type extensions, which cover both encode() and decode() for custom types.

@kevincennis
Copy link
Author

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#tojson_behavior

I guess my argument would be that decode( encode( x ) ) should generally look like JSON.parse( JSON.stringify( x ) ), unless msgpack explicitly knows how to do better.

JavaScript has a native way for objects to define how they should be serialized. It’s hard for me to wrap my head around why you don’t think it’s reasonable to use that in situations where a custom extension hasn’t been defined.

@kevincennis
Copy link
Author

kevincennis commented Aug 23, 2021

Just want to include a few links here of popular libraries that implement toJSON for situations where some JS object may exist which have a number of internal properties relating to its implementation that are irrelevant to the actual data it represents.

The important thing to bear in mind is that the receiver of some msgpack message might not be written in JavaScript. So there isn't necessarily even an equivalent object on the other end if you were to use an extension (e.g. Rust and Python obviously don't have moment.js objects). These are all good examples of that concept. toJSON gives me the underlying data without the baggage of shipping around internal implementation details.

https://github.com/moment/moment/blob/e96809208c9d1b1bbe22d605e76985770024de42/src/lib/moment/prototype.js#L66

https://github.com/Automattic/mongoose/blob/master/lib/document.js#L3794

https://github.com/immutable-js/immutable-js/blob/cb534f07b4187c1c32cd1a166dec257f55f28c51/src/Record.js#L230

@gfx
Copy link
Member

gfx commented Sep 1, 2021

Hmm, still I'm not sure toJSON is reliable for general-purpose serializers/deserializers . It's reasonable that some objects like moment.js's provides toJSON because it's used not only for serialization but also for formatting for humans. On the other hand, some native objects do not have toJSON, for example Map and Set.

I rather think that I can add another option to make warnings about objects which seems not to be serializable. That is, their prototypes are not Object.prototype.

@kevincennis
Copy link
Author

I still think you’re not understanding what toJSON is actually for.

Moment does not include that for “formatting” purposes. It has a format method.

toJSON is specifically meant to be called when an object is being passed (either directly or nested inside another value) to JSON.stringify. It’s a hint that says “this is how I want you to serialize me”.

I don’t understand what you see as the risk to merging this. You don’t lose any functionality, and you make it easier for users to serialize arbitrary non-POJOs - because, again, toJSON is the standard way (defined in the ECMA spec) for objects to define how they should be serialized.

@spmurrayzzz
Copy link

spmurrayzzz commented Sep 3, 2021

Just wanted to cosign what @kevincennis is talking about here (and also have use cases as well for this change). This is very explicit behavior defined in the ECMAScript specification. The SerializeJSONProperty is the tail call to the JSON.stringify() method and exists for the explicit purpose of providing a standard interface for API consumers to define how they want their objects serialized. Whether those consumers want to implement those interfaces is up to them.

Its fine if you don't think this change is necessary for the library, but it fundamentally limits an interoperability path that is well-defined in the language specification. The change proposed in this PR just makes this particular flaw happen less frequently for folks who choose to opt-in to the standards-compliant method of defining their serialization procedures.

On the other hand, some native objects do not have toJSON, for example Map and Set.

Those structures don't have toJSON() methods because they aren't really intended to be serializable for a number of reasons. For Map() specifically, having object keys that are references limits your ability to serialize them in a generic way. Some folks may want to serialize the output of .entries(), others may not. I don't see this as a persuasive argument that the proposed changes in the PR diff are in any way unreliable by design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants