-
-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: string.base64.parse #1158
base: main
Are you sure you want to change the base?
Conversation
Something I discovered while writing the tests is that I intended to base the tests on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, my instinct is that the costs of adding the new implementation to the default scope outweigh the benefits, although I'm open to feedback there.
Despite btoa
being painfully slow, it still seems like providing an option that uses builtins in the default scope would be valuable, and perhaps we can follow up with a PR adding this version to an @ark/extras
(name pending) package so people can opt-in?
Also the CI tests should work once you rebase
@@ -43,6 +43,17 @@ contextualize(() => { | |||
attest(b64url("fn5+").toString()).equals( | |||
'must be base64url-encoded (was "fn5+")' | |||
) | |||
|
|||
const b64parse = type("string.base64.parse") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this group of tests for base64
to its own file like ip.test.ts
or similar.
@@ -43,6 +43,17 @@ contextualize(() => { | |||
attest(b64url("fn5+").toString()).equals( | |||
'must be base64url-encoded (was "fn5+")' | |||
) | |||
|
|||
const b64parse = type("string.base64.parse") | |||
attest(Buffer.from(b64parse("fn5+") as Uint8Array).toString("utf8")).snap( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally snap would recognize Uint8Array
as a builtin here and not map over it, but seems like you'd still have to convert it to a string to easily snapshot it. This seems fine to me for now unless you want to make a change to snapshotting in attest to add builtin logic for this.
const byteLength = (validLen: number, placeHoldersLen: number) => | ||
((validLen + placeHoldersLen) * 3) / 4 - placeHoldersLen | ||
|
||
const parseB64 = (b64: string) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I'm tempted to just say let's use the builtin (btoa) even if it's much slower because people complaining about the perf could just write their own version of the parse that uses one of these libraries, but this is a lot to maintain and add to the bundle size for a perf improvement (albeit significant) on a relatively niche feature.
Another option for this kind of type would be to create a new package like @ark/extras
where they could be imported directly if users need them. If they're builtin to the default scope, there's no way to tree shake it so the cost is high.
I do appreciate the work that went into adapting an optimized version of this, so would rather see it available through a package like that than to go to waste.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that the feature wouldn't be that niche as uploading binary data in the form of a base64 string is fairly common, and I'd hate for people to use string.base64.parse
on the promise that it automatically validates and parses the string into a buffer-like object only to discover it's using an approach that's 10,000% slower than an available alternative (or even worse, not knowing it's an implementation detail and assuming it's arktype itself that is so slow). In my honest opinion, from a user experience perspective, I think it would be better for the feature to not exist at all than exist in such a disadvantageous state.
That said, the maintenance and bundle size arguments are valid, so maybe an @ark/extras
package is warranted if this PR is to go further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, especially until we can support precompilation as a build step for people who care about bundle size, we have to be pretty conservative about this sort of thing.
If you want to migrate this with the changes mentioned to @ark/extras
, I would merge that (although the name might change before publishing).
This PR adds the
string.base64.parse
keyword submodule which parses a base64 string into aUint8Array
.main
branchpnpm prChecks
locallyNotes:
The parsing method for converting a base64 string into a
Uint8Array
is an adapted solution from the base64-js package sinceBuffer.from
isn't available in browsers andbtoa
is notoriously slow.I ran a NodeJS benchmark comparing this solution with
btoa
andBuffer.from
, and got the following results (the test base64 string was a 31KB image):The
btoa
result isn't surprising, but if these results are to be believed (and that's a huge "if"), the base64-js solution is an order of magnitude faster thanBuffer.from
. I'm not sure I trust that conclusion, but seeing asBuffer.from
isn't an option anyway, I'm willing to take the win.TODO:
string.base64.url.parse
string.base64.parse
handle both base64 and base64url strings (which is sort of already supported by the base64-js source).