
Proposal for caching improvements #289

Open

vincentwoo opened this issue Aug 22, 2015 · 19 comments

Comments

@vincentwoo
Contributor

Regarding the issues brought up in #259...

Motivation

We'd like to make caching in JBuilder useful. Currently, caching does not offer significant performance benefits for pretty much any app. This is largely due to the serialization and over-the-wire costs JBuilder pays when storing and fetching cached content, because the current serialization scheme doesn't allow for raw partial content. Below is a proposal for enabling much faster caching while still retaining MultiJson as the serialization backend:

Proposal

  1. When json.cache! cache_key, &block is encountered in a template, Jbuilder emits a placeholder UUID key/value pair into the document (like "jbuilder-XXXX": null) and moves on. It stores the (template digested) cache keys in a list of keys to be looked up, along with their corresponding UUID.

  2. After the template is rendered, the half-baked string is held in memory while JBuilder issues a single fetch_multi covering all of the keys in the set.

  3. For each key that is found in the cache, the corresponding UUID is blindly replaced with the value taken from the cache. In practice this is a simple gsub for "jbuilder-XXXX": null, replacing it with whatever the cache holds, e.g. "color": "red", "count": 2

  4. For each key that is NOT found, execute the block with a fresh json object. Once completed, serialize that json object to a string and strip off the leading and trailing braces. Store the resulting string under the cache key and perform the substitution operation described in (3). The brace stripping must happen because the json.cache! call does not by default happen under a named key, and can be embedded in the flow of a template, e.g.

    json.key1 'value1'
    json.cache! do
      json.key2 'value2'
    end

    produces { "key1": "value1", "key2": "value2"}

    Rendering json.key2 'value2' in isolation produces {"key2": "value2"}, which we then strip of braces, which renders it fit to be inserted into the flow of other JSON objects.
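
For concreteness, here is a runnable toy sketch of the two-pass flow described above. It uses a plain Hash in place of Rails.cache and a lambda in place of a real Jbuilder block, so every name here (CACHE, render_fragment, the 'post/1' key) is illustrative only and not Jbuilder's actual API or internals:

    require 'json'
    require 'securerandom'

    # Toy stand-ins: a Hash for the cache store and a lambda for the cached
    # block, just to show the placeholder/substitution idea end to end.
    CACHE = {}

    def render_fragment(&block)
      # Render the block into a fresh hash, serialize it, and strip the outer
      # braces so the fragment can be spliced into any surrounding object.
      fresh = {}
      block.call(fresh)
      JSON.generate(fresh)[1..-2]           # '{"key2":"value2"}' -> '"key2":"value2"'
    end

    def render_template
      pending = {}                          # uuid => { key:, block: }
      doc = { 'key1' => 'value1' }

      # Pass 1: json.cache! emits a placeholder and records its cache key.
      uuid = "jbuilder-#{SecureRandom.uuid}"
      pending[uuid] = { key: 'post/1', block: ->(json) { json['key2'] = 'value2' } }
      doc[uuid] = nil                       # serializes as "jbuilder-XXXX":null

      rendered = JSON.generate(doc)

      # Pass 2: swap each placeholder for the cached (or freshly rendered) fragment.
      pending.each do |id, entry|
        fragment = CACHE[entry[:key]] ||= render_fragment(&entry[:block])
        rendered = rendered.sub(%("#{id}":null), fragment)
      end

      rendered
    end

    puts render_template  # => {"key1":"value1","key2":"value2"}

A second call to render_template hits CACHE and skips the block entirely, which is where the proposal's savings come from.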

Cons

  • This is a giant fucking hack.

  • Currently, the following case works fine in that only one entry is emitted:

    json.key 'initial value'
    json.cache! do
      json.key 'new value'
    end

    But under the proposal described, JBuilder will double-emit entries for key. By and large this will probably not matter - there's no real reason to write a template this way, and most JSON parsers let the latest definition take precedence anyhow (a quick check of that assumption follows this list).

  • Stuff starts to get tricky with nested partials/caches. It should all work fine in theory; there's just going to be some bookkeeping overhead.
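
As a quick check of the "latest definition wins" assumption, this is what Ruby's bundled JSON parser (one of the adapters MultiJson can sit on top of) does with a duplicated key:

    require 'json'

    # Ruby's bundled JSON parser keeps the last occurrence of a duplicated
    # key, so a double-emitted entry still resolves to the cached value.
    JSON.parse('{"key": "initial value", "key": "new value"}')
    # => {"key"=>"new value"}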

...

Thoughts? @rwz, would love to know what you or @dhh think. If you're down, I can probably either build this or browbeat someone into doing it.

@dhh
Member

dhh commented Aug 28, 2015

This seems like a sound strategy to me. I don't think there's anything hacky about a double-pass rendering process. The caveat re duplicate keys is similar to the caveats we have in HTML views with provide/yield not reaching into a cache. So 👍 from me.

@rwz
Collaborator

rwz commented Aug 31, 2015

I'm skeptical about this approach for various reasons. First, it seems like it's going to increase code complexity quite a lot to make it work with all possible cases, like calling cache in the middle of rendering an array and such. Also, if you're delaying the execution of cached blocks until after the parent template is fully rendered and serialized to JSON, you're losing some context, and I'm not sure that's entirely safe and won't break some of the existing workflows.

But even if we're willing to take the risk, it'd be nice to see some benchmarks proving it's actually going to give a significant performance boost. Are you willing to take a stab at this and prepare a branch implementing the change, so we can compare its performance against master on some existing apps that rely on caching heavily?

@vincentwoo
Contributor Author

I could try to prep a sample app that uses JBuilder caching heavily. As far as I know, no serious app actually uses JBuilder caching because of the performance impact.

Question - is it possible for cache blocks to introspect the state of the json object? If so, that's going to be very annoying for compatibility reasons. Otherwise, I don't think losing context will matter; the blocks should still work.

@tortuetorche

This is largely due to the serialization and over-the-wire costs JBuilder pays when storing and fetching cached content, because the current serialization scheme doesn't allow for raw partial content.

@vincentwoo Maybe it's possible to change the current behavior to store JBuilder data without serialization? Via the :raw option, as explained in the Rails Guides; it should work with the memcache and redis cache stores.
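
For context, the :raw behavior the Rails Guides describe is mainly about skipping Marshal so the backend holds a bare string it can operate on directly. A rough sketch, assuming a memcached- or redis-backed Rails.cache:

    # Rough sketch, assuming a memcached- or redis-backed Rails.cache.
    # :raw skips Marshal serialization, so the store holds the bare string
    # and can increment it natively.
    Rails.cache.write("api_hits", 0, raw: true)
    Rails.cache.increment("api_hits")         # => 1
    Rails.cache.read("api_hits", raw: true)   # => "1"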

@vincentwoo
Contributor Author

You can't. The raw option is for something else (storing pure numbers/strings). Because JBuilder currently works by building a very large in-memory object and only serializing at the end, partial content cannot be stored as raw strings. This proposal aims to allow storing raw partial content.
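
To make the distinction concrete, the cost today comes from caching a Ruby object fragment (Marshal-dumped on write, loaded on read, then re-serialized into the final document), whereas the proposal caches the already-serialized JSON text. A conceptual sketch, not Jbuilder's actual internals:

    require 'json'

    fragment = { "color" => "red", "count" => 2 }

    # Today (conceptually): the cache holds a Ruby object, so every hit
    # pays for deserialization plus re-serialization into the final JSON.
    Rails.cache.write("post/1", fragment)              # Marshal-dumped by the store
    JSON.generate(Rails.cache.read("post/1"))          # loaded, then serialized again

    # Proposed: the cache holds raw JSON text, so a hit is just string splicing.
    Rails.cache.write("post/1/raw", JSON.generate(fragment)[1..-2])
    Rails.cache.read("post/1/raw")                     # => '"color":"red","count":2'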

@chirag7jain

Why can't we just create a hash, store it in our cache store, and feed it to jbuilder?

I am a newbie to Rails and jbuilder, so just ignore this if I am speaking out of context.

@vincentwoo
Contributor Author

Check out the related issue for an in-depth explanation of why you can't: #259

@dirk

dirk commented Oct 3, 2015

I will be implementing this proposal in Everlane's fork.

@chirag7jain

I can collaborate on this if you need help

dirk added a commit to Everlane/jbuilder that referenced this issue Oct 6, 2015
This removes the need to serialize-and-then-deserialize-again cached values, per the proposal outlined in rails#289.
@dirk

dirk commented Oct 6, 2015

@vincentwoo, @rwz, @dhh: This has been implemented in PR #298.

@vincentwoo
Contributor Author

I think we'd want to see side-by-sides and measurables. Can you produce a testing repo? Maybe use a stripped-down and scrubbed copy of Everlane's jbuilder structure?

https://github.com/evanphx/benchmark-ips is the standard for Rails perf PRs
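
For example, a minimal benchmark-ips harness along these lines would make master and the branch directly comparable (render_api_response here is a hypothetical placeholder for whatever renders the cache-heavy Jbuilder template under test):

    require 'benchmark/ips'

    Benchmark.ips do |x|
      # render_api_response is a hypothetical helper standing in for
      # whatever renders the cache-heavy Jbuilder template under test.
      x.report("warm cache") { render_api_response }
      x.compare!
    end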

@dirk

dirk commented Oct 6, 2015

@vincentwoo: I'll put a side-by-side together today.

@dirk

dirk commented Oct 6, 2015

Here are the results from benchmarking with this script:

$ ruby -v
ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-darwin14]

$ git checkout master
$ ruby benchmark.rb
Calculating -------------------------------------
               basic     3.000  i/100ms
-------------------------------------------------
               basic     46.538  (± 4.3%) i/s -    234.000

$ git checkout dirk/caching-improvements
$ ruby benchmark.rb
Calculating -------------------------------------
               basic     9.000  i/100ms
-------------------------------------------------
               basic    112.021  (± 5.4%) i/s -    567.000

@chirag7jain

@vincentwoo @rwz any updates on the pull request?

@tcannonfodder

Improving caching is something I'm definitely interested in, and I'd be more than happy to help with @dirk's implementation to get it ready to :shipit:

@vincentwoo
Contributor Author

You should probably talk in #298.

If you want to take action now I think you can (in that PR):

  1. review and test the code yourself
  2. add test cases
  3. ping rwz

@malomalo

Not sure if this is still being considered, but here are my 2 cents and another solution:

I wouldn't say this is a hack, but handling it will probably bring a plethora of edge cases. I tried this approach because I wanted to stream JSON to an IO, and I ended up with something completely different because of the headaches involved in determining what to cache and how to splice it back into the JSON stream without knowing what JSON would be emitted. I eventually ended up with TurboStreamer.

I also don't like double-pass rendering in general and would use ESI for that.

That aside, if you are still having this issue, you might be interested in TurboStreamer.

I've taken @dirk's test from this thread and made my own report. Attached are the results from my machine.

[benchmark report image]

The downside is TurboStreamer is a little more verbose, but I think the benefits outweigh the cons.

We're using it in production for our Rails application; we halved our GC allocations and reduced the number of times the GC runs in our stack. I'm not exactly sure why TurboStreamer comes out the way it does in this particular test, but perhaps it's the way the test was structured.

Tagging @mad-raz in case you still have this issue with #259.

@PikachuEXE

@malomalo
Does TurboStreamer support OJ?

@malomalo

@PikachuEXE Not currently; the only encoder supported right now is Wankel.

Wankel uses YAJL. I had to make Wankel to get access to the streaming functionality provided by YAJL. I didn't know of OJ until after I made Wankel.

However, it's pretty easy to make an encoder (here's Wankel's, for example). You can read more about it in the README.

It might be possible to make an OJ adapter without any need from their end, but I'm not sure yet. If you have any ideas about how to implement the capture(&block) function with OJ, I'd be open to adding the adapter. Currently, I believe getting it to work with OJ would require some help from them.

The capture(&block) method is required to support caching.
