
Adding multiple buckets with the same ID causes CPU to be pegged #13

Closed
glittershark opened this issue Oct 18, 2020 · 12 comments

@glittershark
Contributor

At least I think that's what's happening: since superlifter assocs the bucket in directly, with no regard to what's already there, and then only stops the buckets it knows about on shutdown, any overwritten bucket just loops forever. This is relatively easy to hit by running any three-level request, since all the buckets in the second level just overwrite each other. What I'd love to see is some sort of intelligent merging of bucket options, e.g. summing thresholds together or taking the max of all intervals.
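For concreteness, here's roughly how I'm hitting it. This is only a sketch: it assumes ctx is an already-started superlifter context and that bucket opts take the trigger shape from the readme, both from memory, so treat the details as illustrative.

(require '[superlifter.core :as sl.core])

(defn register-children-bucket!
  "Called independently by every second-level resolver."
  [ctx]
  (sl.core/add-bucket! ctx :children {:triggers {:interval {:interval 100}}}))

(defn repro [ctx]
  ;; Two sibling resolvers register the same id. The second call assocs straight
  ;; over the first bucket, whose interval trigger is never stopped and keeps
  ;; firing against a queue that nothing will ever drain.
  (register-children-bucket! ctx)
  (register-children-bucket! ctx))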

@glittershark
Contributor Author

Here's a sketch somewhat like what I'm thinking of. Note that since this nests swap!s (each bucket's :queue atom is swapped inside the swap of the :buckets atom), it should almost certainly replace the :queue atoms and the :buckets atom with refs + STM.

(defn stop-bucket! [bucket]
  (doseq [{:keys [stop-fn]} (vals (:triggers bucket))
          :when stop-fn]
    (stop-fn)))

;; merge strategy dispatches on the trigger kind; arity matches the call site in merge-buckets below
(defmulti merge-triggers (fn [kind _trigger1 _trigger2] kind))

(defmethod merge-triggers :default [_kind _trigger1 trigger2] trigger2)

;; queue-size triggers merge by summing their thresholds
(defmethod merge-triggers :queue-size [_kind trigger1 trigger2]
  (update trigger1 :threshold + (:threshold trigger2)))

(defn merge-buckets [bucket1 bucket2]
  (when (:queue bucket2)
    (swap! (:queue bucket1) into @(:queue bucket2)))
  (update bucket1 :triggers
          (fn [triggers]
            (reduce-kv
             (fn [ts kind trigger]
               (if (contains? ts kind)
                 (update ts kind #(merge-triggers kind % trigger))
                 (assoc ts kind trigger)))
             triggers
             (:triggers bucket2)))))

(defn add-bucket! [ctx id opts]
  (swap!
   (:buckets ctx)
   (fn [buckets]
     (update buckets id
             (fn [bucket]
               (if bucket
                 (do
                   (stop-bucket! bucket)
                   (merge-buckets bucket opts))
                 (#'sl.core/start-bucket! ctx id opts)))))) ;; sl.core is assumed to alias superlifter.core; the var quote reaches its non-public start-bucket!
  ctx)

@oliyh
Owner

oliyh commented Oct 19, 2020

Hi,

Thank you for this very detailed issue. My original intention was for buckets to either be perpetual and constant (i.e. something like fetch every 100ms) or to be created for a specific purpose (this node has six children, create a sized bucket for fetching those six). If I understand your particular use case, you have a list of nodes n * A, each of which has multiple children m * B, and you want to be able to fetch n * m * B in one go, so you need each resolver of A to be able to contribute its count of m to the B bucket. You would need to be sure that your resolver framework fetches breadth first, not depth first, for this to work.
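To make that concrete, each resolver of A would need to do something along these lines before enqueuing its children into the B bucket. This is purely a sketch against the bucket shape in your snippet above (a buckets atom of id -> {:triggers {kind trigger}}); bump-queue-size! and the :b-bucket id are hypothetical, not anything superlifter currently provides.

(defn bump-queue-size!
  "Raise bucket id's :queue-size threshold by n, so the bucket only triggers
   once all n * m muses destined for B have been enqueued."
  [ctx id n]
  (swap! (:buckets ctx)
         update-in [id :triggers :queue-size :threshold]
         (fnil + 0) n))

;; e.g. in a resolver for an A node with m children:
;;   (bump-queue-size! ctx :b-bucket (count (:child-ids a-node)))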

In this case I would consider adding a new function called update-bucket, which would work rather like the core update function, giving you the existing bucket description and allowing you to alter it. I feel this would give finer control over what is going on, although it is probably worth pointing out in the readme that if you have resolvers that re-use or share buckets, particularly size-triggered ones, then you will need to use update-bucket.

What do you think?

@glittershark
Contributor Author

You would need to be sure that your resolver framework fetches breadth first, not depth, for this to work

Yeah, I'll have to look into what Lacinia does; maybe I'll send them a patch if it doesn't.

In this case I would consider adding a new function called update-bucket which would work rather like the core update function, giving you the existing bucket description and allowing you to alter the description of it

The individual resolver doesn't know whether it has any siblings, though, so it wouldn't know whether to call update or add. Maybe the update semantics could nil-pun, something like the sketch below.
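Just a sketch of the semantics, all names hypothetical: update-bucket! behaves like core update over the bucket description, and nil-puns by creating the bucket when it doesn't exist yet. (Starting/stopping the actual triggers is left aside here.)

(defn update-bucket! [ctx id f & args]
  (swap! (:buckets ctx)
         (fn [buckets]
           (update buckets id #(apply f % args))))
  ctx)

;; A resolver about to enqueue 6 children into a shared size-triggered bucket
;; doesn't need to know whether a sibling already created it:
(defn contribute-children! [ctx]
  (update-bucket! ctx :child-fetches
                  (fn [bucket]
                    (-> (or bucket {:triggers {:queue-size {:threshold 0}}})
                        (update-in [:triggers :queue-size :threshold] + 6)))))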

although it probably is worth pointing out in the readme that if you have resolvers that re-use or share buckets, particularly size triggered ones, then you will need to use update-bucket

I still think there's an underlying bug here, which is that buckets that get (potentially accidentally) overwritten never have their triggers stopped. Separately from the update-bucket idea, I think that should be turned into a warning, with the overwritten bucket stopped. I say warning rather than error because, again, different queries (supplied by the client!) can change whether resolvers have siblings or not, and I think it'd be confusing (and potentially disastrous in production!) to have one query work fine and another return an error.
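Roughly the shape of what I mean (not the actual patch): stop the old bucket's triggers and warn, instead of silently leaking them. stop-bucket! is the helper from my sketch above, and the logging library is just illustrative.

(require '[clojure.tools.logging :as log])

(defn add-bucket! [ctx id opts]
  (swap! (:buckets ctx)
         (fn [buckets]
           (when-let [existing (get buckets id)]
             (log/warnf "Overwriting existing bucket %s; stopping its triggers" id)
             (stop-bucket! existing))
           (assoc buckets id (#'sl.core/start-bucket! ctx id opts))))
  ctx)

;; (Side effects inside swap! can run more than once on retry, which is part of
;; why I suggested refs + STM above.)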

glittershark added a commit to glittershark/superlifter that referenced this issue Oct 19, 2020
When adding a bucket with an ID that's already used, the previous bucket is currently just silently overwritten. This causes issues if the bucket has, for example, an interval trigger, since that continues to run unchecked indefinitely, which was the original issue described in oliyh#13. This change stops any overwritten bucket and logs a warning.
@oliyh
Owner

oliyh commented Oct 19, 2020

Agree that siblings aren't aware of each other so an update/upsert behaviour would create the bucket if it didn't already exist, but I would like the developer to make the decision on how to merge buckets. Superlifter could of course provide some common strategies, such as you have suggested above.

Lacinia does do breadth-first resolution and I'm fairly certain that most other graphql implementations do the same but I thought it worth noting as superlifter can be used in any context, not just graphql.

oliyh pushed a commit that referenced this issue Oct 20, 2020
When adding a bucket with an ID that's already used, the previous bucket is currently just silently overwritten. This causes issues if the bucket has, for example, an interval trigger, since that continues to run unchecked indefinitely, which was the original issue described in #13. This change stops any overwritten bucket and logs a warning.
@oliyh
Owner

oliyh commented Nov 12, 2020

Hi @glittershark

I have rewritten quite a lot of superlifter to avoid the nested atoms (there's only one at the top level now for buckets) and also to ensure that enqueuing and fetching are atomic. I have changed the implementation of add-bucket so that it now effectively does nothing if the bucket already exists. Could you try 0.1.3-SNAPSHOT and let me know if it works for your use case?

One of the bugs was that if you have a bucket of size 3, you need to ensure that all 3 muses destined for it end up there, otherwise it will never trigger. This means you can't do a partial fetch early when there is, say, 1 muse, because its 2 siblings will be added later and would never reach the threshold of 3, which is why I changed add-bucket not to stop or do a partial fetch when overwritten. I think perhaps add-bucket should not accept an id but instead return one, and the user should ensure that this id is used in the appropriate place. That way we would never get cross-talk on the buckets, which I believe is what has caused these problems.
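To sketch the generated-id idea (purely hypothetical, nothing like this exists yet; I'm reusing the private start-bucket! var from your earlier sketch just for illustration):

(require '[superlifter.core :as sl.core])

(defn add-bucket!
  "Creates a bucket with a generated id and returns that id; the caller threads
   the id to wherever it enqueues muses for this fetch."
  [ctx opts]
  (let [id (keyword (gensym "bucket"))]
    (swap! (:buckets ctx) assoc id (#'sl.core/start-bucket! ctx id opts))
    id))

;; Two unrelated resolvers can then never collide on a shared id by accident,
;; because neither of them chooses the id.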

Let me know what you think.

Cheers,

@oliyh
Owner

oliyh commented Nov 16, 2020

See #12 (comment)

@glittershark
Contributor Author

I'll hopefully get a chance to look at this this weekend.

@glittershark
Contributor Author

@oliyh did something about the wrapping API change? because now my tests are failing with "Field resolver returned a single value, expected a collection of values."

@glittershark
Contributor Author

glittershark commented Nov 28, 2020

@oliyh did something about the wrapping API change? because now my tests are failing with "Field resolver returned a single value, expected a collection of values."

This appears to be caused by resolve/with-context - reporting that elsewhere (#19)

@oliyh
Owner

oliyh commented Nov 28, 2020

Hello,

Nothing changed intentionally, I will add with-context to the example project and work out what has happened.

Regarding your NPE: you need to ensure the bucket exists before calling update-trigger!, either by making it part of your initial superlifter options (preferred, and how the example is now configured) or by calling add-bucket! first. (But I think you've already figured that out.)
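For reference, the options-map route looks roughly like this. The bucket ids and values here are made up and the exact option keys may differ, so check the example project for the shape it now uses.

(def superlifter-opts
  {:buckets {:default       {:triggers {:queue-size {:threshold 10}}}
             :child-fetches {:triggers {:interval {:interval 50}}}}
   :urania-opts {:env {}}})

;; Any bucket you later target with update-trigger! (or enqueue into) must either
;; be declared here up front or created with add-bucket! beforehand.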

@glittershark
Contributor Author

The NPE was just due to a missing with-superlifter

@glittershark
Contributor Author

Feel okay with closing this for now
