
Adding multiple buckets with the same ID causes CPU to be pegged #13

Closed
glittershark opened this issue Oct 18, 2020 · 12 comments

@glittershark
Contributor

At least I think that's what's happening: since superlifter assocs the bucket in directly, with no regard to what's already there, and then only stops the buckets it knows about on shutdown, any overwritten bucket just loops forever. This is relatively easy to hit by running any three-level request, since all the buckets in the second level just overwrite each other. What I'd love to see is some sort of intelligent merging of bucket options, e.g. summing thresholds together or taking the max of all intervals.
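For concreteness, here's roughly how I'm hitting it. This is only a sketch: it assumes ctx is an already-started superlifter context and that bucket opts take the trigger shape from the readme, both from memory, so treat the details as illustrative.

(require '[superlifter.core :as sl.core])

(defn register-children-bucket!
  "Called independently by every second-level resolver."
  [ctx]
  (sl.core/add-bucket! ctx :children {:triggers {:interval {:interval 100}}}))

(defn repro [ctx]
  ;; Two sibling resolvers register the same id. The second call assocs straight
  ;; over the first bucket, whose interval trigger is never stopped and keeps
  ;; firing against a queue that nothing will ever drain.
  (register-children-bucket! ctx)
  (register-children-bucket! ctx))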

@glittershark
Contributor Author

Here's a sketch somewhat like what I'm thinking of. Note that since this nests swap!s (each bucket's :queue atom is swapped inside the swap of the :buckets atom), it should almost certainly replace the :queue atoms and the :buckets atom with refs + STM.

(defn stop-bucket! [bucket]
  (doseq [{:keys [stop-fn]} (vals (:triggers bucket))
          :when stop-fn]
    (stop-fn)))

;; merge strategy dispatches on the trigger kind; arity matches the call site in merge-buckets below
(defmulti merge-triggers (fn [kind _trigger1 _trigger2] kind))

(defmethod merge-triggers :default [_kind _trigger1 trigger2] trigger2)

;; queue-size triggers merge by summing their thresholds
(defmethod merge-triggers :queue-size [_kind trigger1 trigger2]
  (update trigger1 :threshold + (:threshold trigger2)))

(defn merge-buckets [bucket1 bucket2]
  (when (:queue bucket2)
    (swap! (:queue bucket1) into @(:queue bucket2)))
  (update bucket1 :triggers
          (fn [triggers]
            (reduce-kv
             (fn [ts kind trigger]
               (if (contains? ts kind)
                 (update ts kind #(merge-triggers kind % trigger))
                 (assoc ts kind trigger)))
             triggers
             (:triggers bucket2)))))

(defn add-bucket! [ctx id opts]
  (swap!
   (:buckets ctx)
   (fn [buckets]
     (update buckets id
             (fn [bucket]
               (if bucket
                 (do
                   (stop-bucket! bucket)
                   (merge-buckets bucket opts))
                 (#'sl.core/start-bucket! ctx id opts)))))) ;; sl.core is assumed to alias superlifter.core; the var quote reaches its non-public start-bucket!
  ctx)

@oliyh
Owner

oliyh commented Oct 19, 2020

Hi,

Thank you for this very detailed issue. My original intention was for buckets to either be perpetual and constant (i.e. something like fetch every 100ms) or to be created for a specific purpose (this node has six children, create a sized bucket for fetching those six). If I understand your particular use case, you have a list of nodes n * A, each of which has multiple children m * B, and you want to be able to fetch n * m * B in one go, so you need each resolver of A to be able to contribute its count of m to the B bucket. You would need to be sure that your resolver framework fetches breadth first, not depth first, for this to work.
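To make that concrete, each resolver of A would need to do something along these lines before enqueuing its children into the B bucket. This is purely a sketch against the bucket shape in your snippet above (a buckets atom of id -> {:triggers {kind trigger}}); bump-queue-size! and the :b-bucket id are hypothetical, not anything superlifter currently provides.

(defn bump-queue-size!
  "Raise bucket id's :queue-size threshold by n, so the bucket only triggers
   once all n * m muses destined for B have been enqueued."
  [ctx id n]
  (swap! (:buckets ctx)
         update-in [id :triggers :queue-size :threshold]
         (fnil + 0) n))

;; e.g. in a resolver for an A node with m children:
;;   (bump-queue-size! ctx :b-bucket (count (:child-ids a-node)))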

In this case I would consider adding a new function called update-bucket, which would work rather like the core update function, giving you the existing bucket description and allowing you to alter it. I feel this would give finer control over what is going on, although it is probably worth pointing out in the readme that if you have resolvers that re-use or share buckets, particularly size-triggered ones, then you will need to use update-bucket.

What do you think?

@glittershark
Contributor Author

You would need to be sure that your resolver framework fetches breadth first, not depth, for this to work

Yeah, I'll have to look into what Lacinia does; maybe I'll send them a patch if it doesn't.

In this case I would consider adding a new function called update-bucket which would work rather like the core update function, giving you the existing bucket description and allowing you to alter the description of it

The individual resolver doesn't know whether it has any siblings, though, so it wouldn't know whether to call update or add. Maybe the update semantics could nil-pun, something like the sketch below.
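Just a sketch of the semantics, all names hypothetical: update-bucket! behaves like core update over the bucket description, and nil-puns by creating the bucket when it doesn't exist yet. (Starting/stopping the actual triggers is left aside here.)

(defn update-bucket! [ctx id f & args]
  (swap! (:buckets ctx)
         (fn [buckets]
           (update buckets id #(apply f % args))))
  ctx)

;; A resolver about to enqueue 6 children into a shared size-triggered bucket
;; doesn't need to know whether a sibling already created it:
(defn contribute-children! [ctx]
  (update-bucket! ctx :child-fetches
                  (fn [bucket]
                    (-> (or bucket {:triggers {:queue-size {:threshold 0}}})
                        (update-in [:triggers :queue-size :threshold] + 6)))))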

although it probably is worth pointing out in the readme that if you have resolvers that re-use or share buckets, particularly size triggered ones, then you will need to use update-bucket

I still think there's an underlying bug here, which is that buckets that get (potentially accidentally) overwritten never have their triggers stopped. Separately from the update-bucket idea, I think that should be turned into a warning, with the overwritten bucket stopped. I say warning rather than error because, again, different queries (supplied by the client!) can change whether resolvers have siblings or not, and I think it'd be confusing (and potentially disastrous in production!) to have one query work fine and another return an error.
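Roughly the shape of what I mean (not the actual patch): stop the old bucket's triggers and warn, instead of silently leaking them. stop-bucket! is the helper from my sketch above, and the logging library is just illustrative.

(require '[clojure.tools.logging :as log])

(defn add-bucket! [ctx id opts]
  (swap! (:buckets ctx)
         (fn [buckets]
           (when-let [existing (get buckets id)]
             (log/warnf "Overwriting existing bucket %s; stopping its triggers" id)
             (stop-bucket! existing))
           (assoc buckets id (#'sl.core/start-bucket! ctx id opts))))
  ctx)

;; (Side effects inside swap! can run more than once on retry, which is part of
;; why I suggested refs + STM above.)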

glittershark added a commit to glittershark/superlifter that referenced this issue Oct 19, 2020
When adding a bucket with an ID that's already used, the previous bucket is currently just silently overwritten. This causes issues if the bucket has, for example, an interval trigger, since that continues to run unchecked indefinitely, which was the original issue described in oliyh#13. This change stops any overwritten bucket and logs a warning.
@oliyh
Owner

oliyh commented Oct 19, 2020

Agree that siblings aren't aware of each other so an update/upsert behaviour would create the bucket if it didn't already exist, but I would like the developer to make the decision on how to merge buckets. Superlifter could of course provide some common strategies, such as you have suggested above.

Lacinia does do breadth-first resolution and I'm fairly certain that most other graphql implementations do the same but I thought it worth noting as superlifter can be used in any context, not just graphql.

oliyh pushed a commit that referenced this issue Oct 20, 2020
When adding a bucket with an ID that's already used, the previous bucket is currently just silently overwritten. This causes issues if the bucket has, for example, an interval trigger, since that continues to run unchecked indefinitely, which was the original issue described in #13. This change stops any overwritten bucket and logs a warning.
@oliyh
Owner

oliyh commented Nov 12, 2020

Hi @glittershark

I have rewritten quite a lot of superlifter to avoid the nested atoms (there's only one at the top level now for buckets) and also to ensure that enqueuing and fetching are atomic. I have changed the implementation of add-bucket so that it now effectively does nothing if the bucket already exists. Could you try 0.1.3-SNAPSHOT and let me know if it works for your use case?

One of the bugs was that if you have a bucket of size 3, you need to ensure that all 3 muses destined for it end up there, otherwise it will never trigger. This means you can't do a partial fetch early when there is, say, 1 muse, because its 2 siblings will be added later and would never reach the threshold of 3, which is why I changed add-bucket not to stop or do a partial fetch when overwritten. I think perhaps add-bucket should not accept an id but instead return one, and the user should ensure that this id is used in the appropriate place. That way we would never get cross-talk on the buckets, which I believe is what has caused these problems.
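To sketch the generated-id idea (purely hypothetical, nothing like this exists yet; I'm reusing the private start-bucket! var from your earlier sketch just for illustration):

(require '[superlifter.core :as sl.core])

(defn add-bucket!
  "Creates a bucket with a generated id and returns that id; the caller threads
   the id to wherever it enqueues muses for this fetch."
  [ctx opts]
  (let [id (keyword (gensym "bucket"))]
    (swap! (:buckets ctx) assoc id (#'sl.core/start-bucket! ctx id opts))
    id))

;; Two unrelated resolvers can then never collide on a shared id by accident,
;; because neither of them chooses the id.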

Let me know what you think.

Cheers,

@oliyh
Owner

oliyh commented Nov 16, 2020

See #12 (comment)

@glittershark
Contributor Author

I'll hopefully get a chance to look at this this weekend.

@glittershark
Contributor Author

@oliyh did something about the wrapping API change? because now my tests are failing with "Field resolver returned a single value, expected a collection of values."

@glittershark
Contributor Author

glittershark commented Nov 28, 2020

@oliyh did something about the wrapping API change? because now my tests are failing with "Field resolver returned a single value, expected a collection of values."

This appears to be caused by resolve/with-context - reporting that elsewhere (#19)

@oliyh
Owner

oliyh commented Nov 28, 2020

Hello,

Nothing changed intentionally, I will add with-context to the example project and work out what has happened.

Regarding your NPE: you need to ensure the bucket exists before calling update-trigger!, either by making it part of your initial superlifter options (preferred, and how the example is now configured) or by calling add-bucket! first. (But I think you've already figured that out.)
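For reference, the options-map route looks roughly like this. The bucket ids and values here are made up and the exact option keys may differ, so check the example project for the shape it now uses.

(def superlifter-opts
  {:buckets {:default       {:triggers {:queue-size {:threshold 10}}}
             :child-fetches {:triggers {:interval {:interval 50}}}}
   :urania-opts {:env {}}})

;; Any bucket you later target with update-trigger! (or enqueue into) must either
;; be declared here up front or created with add-bucket! beforehand.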

@glittershark
Contributor Author

The NPE was just due to a missing with-superlifter

@glittershark
Contributor Author

Feel okay with closing this for now
