Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Base64 performance #185

Merged
merged 3 commits into from
Jun 23, 2024
Merged

Conversation

Jakski
Copy link
Contributor

@Jakski Jakski commented Jun 21, 2024

Thank you for maintaining Spork. It's very useful in everyday scripting and Base64 implementation allowed me to quickly prototype my ideas.

Upon profiling my code I've discovered that most of time is spent in Base64 functions and I'm not even processing that big files. For example encoding like this:

(import spork/base64)
(let [start (os/clock :monotonic)]
  # Imagine a file with 4000 lines each 80 characters long, so about 316 kilobytes
  (base64/encode (string/repeat @"A" (* 80 4000)))
  (print (- (os/clock :monotonic) start)))

takes 5.7 seconds on my machine and decoding like this:

(import spork/base64)
(let [start (os/clock :monotonic)]
  # Imagine a file with 4000 lines each 80 characters long, so about 316 kilobytes
  (base64/decode (string/repeat @"A" (* 80 4000)))
  (print (- (os/clock :monotonic) start)))

takes 2.7 seconds on my machine.

I tried to profile the current implementation, but finding a single bottleneck is very hard. It seems like the most time is spent in converting arrays between sextets, octets, decimals and binaries.

I've re-implemented Base64 with the following ideas:

  • Output buffers are allocated once and with final size
  • Instead of breaking input into arrays, final values are calculated using bitwise operations
  • Explicit error is thrown upon discovering unknown character
  • Length is checked, so invalid encoded input like A will cause error

As the result decoding dropped to 0.19 second and encoding to 0.07 second.

spork/base64.janet Outdated Show resolved Hide resolved
@bakpakin
Copy link
Member

Looks good, I haven't used the base64 module so I didn't realize it was so slow.

Maybe we could just implement it in C for performance?

@Jakski
Copy link
Contributor Author

Jakski commented Jun 21, 2024

Implementing in C sounds good. I'll mark this PR as a draft for now and prepare required changes.

@Jakski Jakski marked this pull request as draft June 21, 2024 23:19
@sogaiu
Copy link
Contributor

sogaiu commented Jun 22, 2024

It is nice to be able to use the base64 module as pure-janet as it can just be copy-pasted into a project where there is no use of C.

I based an implementation of base32 off of it here.

May be it can be kept somewhere -- as an example may be? Or may be there can be a kind of pure-janet collection of some sorts?

@pepe
Copy link
Member

pepe commented Jun 22, 2024

Fyi I have cjanet implementation in https://git.sr.ht/~pepe/gp/tree/master/item/cjanet/codec.janet

@Jakski Jakski mentioned this pull request Jun 23, 2024
@bakpakin bakpakin marked this pull request as ready for review June 23, 2024 18:34
@Jakski
Copy link
Contributor Author

Jakski commented Jun 23, 2024

@sogaiu I see your point. I've created a separate package with pure Janet implementation janet-lang/pkgs#66

Latest commits in this PR include optimized C version. I don't have much experience with Janet C API, so any feedback is welcome.

@bakpakin
Copy link
Member

This looks pretty good @Jakski , are you going to make any more changes or can I merge it? The code looks pretty clean and the C-API usage looks pretty good. As a convention, for encoding functions I often add an optional buffer to pass in so you can encode directly into an existing buffer.

@Jakski
Copy link
Contributor Author

Jakski commented Jun 23, 2024

I see. I don't plan any more changes, so it's ready for merging.

@bakpakin bakpakin merged commit d024557 into janet-lang:master Jun 23, 2024
1 check passed
@pepe
Copy link
Member

pepe commented Jun 25, 2024

I guess https://github.com/janet-lang/spork/blob/master/spork/init.janet#L4 needs to be removed. And maybe add the native module at the bottom of the same file?

@sogaiu
Copy link
Contributor

sogaiu commented Jun 25, 2024

I would have thought so too, along with something around here.

(use spork) seems to work for me oddly enough (with the latest installed), however, doing (use ./spork) from the root directory of a cloned spork repository gives the following here:

(use ./spork)
error: could not find module ./base64:
    spork/base64.jimage
    spork/base64.janet
    spork/base64/init.janet
    spork/base64.so
  in require-1 [boot.janet] on line 3017, column 20
  in import* [boot.janet] on line 3061, column 15
  in thunk [spork/init.janet] (tail call) on line 4, column 1
  in dofile [boot.janet] (tail call) on line 2992, column 7
  in source-loader [boot.janet] on line 3004, column 15
  in require-1 [boot.janet] on line 3028, column 18
  in import* [boot.janet] (tail call) on line 3061, column 15

@pepe
Copy link
Member

pepe commented Jun 25, 2024

Are you sure you do not have some remnants of the old version? I will be AFK for at least two days, but I encountered the problem in fresh environment

@sogaiu
Copy link
Contributor

sogaiu commented Jun 25, 2024

I did jpm uninstall to check, so I think so, but I can check again.

@sogaiu
Copy link
Contributor

sogaiu commented Jun 25, 2024

Below is a log of what I did.

May be I'm missing something...

$ jpm uninstall
removing /home/user/.local/lib/janet/spork
removing /home/user/.local/bin/janet-format
removing /home/user/.local/bin/janet-netrepl
removing /home/user/.local/lib/janet/spork/json.so
removing /home/user/.local/lib/janet/spork/json.meta.janet
removing /home/user/.local/lib/janet/spork/json.a
removing /home/user/.local/lib/janet/spork/rawterm.so
removing /home/user/.local/lib/janet/spork/rawterm.meta.janet
removing /home/user/.local/lib/janet/spork/rawterm.a
removing /home/user/.local/lib/janet/spork/crc.so
removing /home/user/.local/lib/janet/spork/crc.meta.janet
removing /home/user/.local/lib/janet/spork/crc.a
removing /home/user/.local/lib/janet/spork/utf8.so
removing /home/user/.local/lib/janet/spork/utf8.meta.janet
removing /home/user/.local/lib/janet/spork/utf8.a
removing /home/user/.local/lib/janet/spork/tarray.so
removing /home/user/.local/lib/janet/spork/tarray.meta.janet
removing /home/user/.local/lib/janet/spork/tarray.a
removing /home/user/.local/lib/janet//tarray.h
removing /home/user/.local/lib/janet/spork/zip.so
removing /home/user/.local/lib/janet/spork/zip.meta.janet
removing /home/user/.local/lib/janet/spork/zip.a
removing /home/user/.local/lib/janet/spork/cmath.so
removing /home/user/.local/lib/janet/spork/cmath.meta.janet
removing /home/user/.local/lib/janet/spork/cmath.a
removing /home/user/.local/lib/janet/spork/base64.so
removing /home/user/.local/lib/janet/spork/base64.meta.janet
removing /home/user/.local/lib/janet/spork/base64.a
removing manifest /home/user/.local/lib/janet/.manifests/spork.jdn
Uninstalled.
$ janet
Janet 1.35.2-a5d6b228 linux/x64/gcc - '(doc)' for help
repl:1:> (use spork)
error: could not find module spork:
    /home/user/.local/lib/janet/spork.jimage
    /home/user/.local/lib/janet/spork.janet
    /home/user/.local/lib/janet/spork/init.janet
    /home/user/.local/lib/janet/spork.so
  in require-1 [boot.janet] on line 3017, column 20
  in import* [boot.janet] (tail call) on line 3061, column 15
repl:2:> 
$ jpm install
generating /home/user/.local/lib/janet/.manifests/spork.jdn...
Installed as 'spork'.
copying spork to /home/user/.local/lib/janet...
installing bin/janet-format to /home/user/.local/bin/janet-format
installing bin/janet-netrepl to /home/user/.local/bin/janet-netrepl
copying build/spork/json.so to /home/user/.local/lib/janet/spork...
copying build/spork/json.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/json.a to /home/user/.local/lib/janet/spork...
copying build/spork/rawterm.so to /home/user/.local/lib/janet/spork...
copying build/spork/rawterm.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/rawterm.a to /home/user/.local/lib/janet/spork...
copying build/spork/crc.so to /home/user/.local/lib/janet/spork...
copying build/spork/crc.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/crc.a to /home/user/.local/lib/janet/spork...
copying build/spork/utf8.so to /home/user/.local/lib/janet/spork...
copying build/spork/utf8.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/utf8.a to /home/user/.local/lib/janet/spork...
copying build/spork/tarray.so to /home/user/.local/lib/janet/spork...
copying build/spork/tarray.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/tarray.a to /home/user/.local/lib/janet/spork...
copying src/tarray.h to /home/user/.local/lib/janet/...
copying build/spork/zip.so to /home/user/.local/lib/janet/spork...
copying build/spork/zip.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/zip.a to /home/user/.local/lib/janet/spork...
copying build/spork/cmath.so to /home/user/.local/lib/janet/spork...
copying build/spork/cmath.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/cmath.a to /home/user/.local/lib/janet/spork...
copying build/spork/base64.so to /home/user/.local/lib/janet/spork...
copying build/spork/base64.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/base64.a to /home/user/.local/lib/janet/spork...
$ janet
Janet 1.35.2-a5d6b228 linux/x64/gcc - '(doc)' for help
repl:1:> (use spork)
@{_ @{:value } argparse/argparse @{:private true} base64/decode @{:private true} base64/encode @{:private true} crc/make-variant @{:private true} crc/named-variant @{:private true} cron/check @{:private true} cron/next-timestamp @{:private true} cron/parse-cron @{:private true} ev-utils/go-nursery @{:private true} ev-utils/join-nursery @{:private true} ev-utils/multithread-service @{:private true} ev-utils/nursery @{:private true} ev-utils/pcall @{:private true} ev-utils/pdag @{:private true} ev-utils/pmap @{:private true} ev-utils/pmap-full @{:private true} ev-utils/pmap-limited @{:private true} ev-utils/spawn-nursery @{:private true} ev-utils/wait-cancel @{:private true} fmt/*user-indent-2-forms* @{:private true} fmt/format @{:private true} fmt/format-file @{:private true} fmt/format-print @{:private true} generators/concat @{:private true} generators/cycle @{:private true} generators/drop @{:private true} generators/drop-until @{:private true} generators/drop-while @{:private true} generators/filter @{:private true} ...}

FWIW, the first thing that comes up for git log for my local spork clone is:

commit d024557895f327d5527b1ecdd063f71372e9a97e (HEAD -> master, origin/master, origin/HEAD)
Merge: e598ef4 dc1cc19
Author: Calvin Rose <[email protected]>
Date:   Sun Jun 23 11:58:13 2024 -0700

    Merge pull request #185 from Jakski/faster-base64

@sogaiu
Copy link
Contributor

sogaiu commented Jun 25, 2024

If I install using (bundle/install ".") and then (use spork), I do get an error:

(use spork)
error: could not find module ./base64:
    /home/user/.local/lib/janet/spork/base64.jimage
    /home/user/.local/lib/janet/spork/base64.janet
    /home/user/.local/lib/janet/spork/base64/init.janet
    /home/user/.local/lib/janet/spork/base64.so
  in require-1 [boot.janet] on line 3017, column 20
  in import* [boot.janet] on line 3061, column 15
  in thunk [/home/user/.local/lib/janet/spork/init.janet] (tail call) on line 4, column 1
  in dofile [boot.janet] (tail call) on line 2992, column 7
  in source-loader [boot.janet] on line 3004, column 15
  in require-1 [boot.janet] on line 3028, column 18
  in import* [boot.janet] (tail call) on line 3061, column 15

@Jakski
Copy link
Contributor Author

Jakski commented Jun 25, 2024

(use spork) seems to work for me oddly enough (with the latest installed), however, doing (use ./spork) from the root directory of a cloned spork repository gives the following here:

I believe this is caused by different layout of installed modules vs local compilation artifacts. When you run jpm build compiled artifacts are placed in local ./build/ directory, but when you do jpm install then *.janet and *.so files are placed in the same directory(~/.asdf/installs/janet/ref-a5d6b2283834422a9fa9e79b5c7ad9b932b52568/lib/janet/spork/ in my case). You can access compiled modules directly with (use ./build/spork/base64).

Yet I don't understand why do we import native and non-native dependencies differently in https://github.com/janet-lang/spork/blob/master/spork/init.janet#L4 It seems that both versions work the same, once module is installed.

If I install using (bundle/install ".") and then (use spork), I do get an error:

I confirm. It happens to me as well with the latest Janet. I've just caught up with discussion from janet-lang/janet#1444 and it seems to me that we will need to support both methods of installation. As far as I understand bundle/ functions are not supposed to read project.janet(correct me on this one). They use https://github.com/janet-lang/spork/blob/d024557895f327d5527b1ecdd063f71372e9a97e/bundle/init.janet instead.

@sogaiu
Copy link
Contributor

sogaiu commented Jun 26, 2024

Yet I don't understand why do we import native and non-native dependencies differently in https://github.com/janet-lang/spork/blob/master/spork/init.janet#L4 It seems that both versions work the same, once module is installed.

I don't know either.

FWIW, looks like the pattern of doing the imports differently may have started here.

bundle/ functions are not supposed to read project.janet

I think that's correct.

AFAIU, project.janet is a jpm-only thing and bundle/* is an "out-of-the-box" janet thing.

@sogaiu
Copy link
Contributor

sogaiu commented Jul 11, 2024

Just adding a note here in case someone faces a situation where a pure-Janet piece gets replaced with a native one and they are looking for possible ways forward.


I found that janet-playground had broken as it relied on being able to make-image the result of (require "spork").

I believe this worked in the days when there were no native modules in spork (at least it worked in mid-2022), but at present, it doesn't.

Here is some related detail. There seemed to be an easy work-around in that case -- just (require "spork/fmt") instead.

The reason I bring this up in this PR is that a pure-Janet piece got replaced with a native one and had it been spork/fmt, the specific work-around mentioned above would not have sufficed.

Other work-arounds that could be considered include:

  • Depend on an earlier specific version of spork
  • Vendor spork or a portion of it
  • Use an alternative implemention (e.g. in this PR, the author was kind enough to host a pure-Janet version)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants