Function to compute optimal ecmult_multi scratch size for a number of points #638

jonasnick · 2019-06-12T21:34:48Z

@DavidBurkett asked for a way to compute the optimal scratch size for Schnorr batch verification (BlockstreamResearch/secp256k1-zkp#69). This PR is a prerequisite, but it also contains a number of other fixups.

Besides adding the new function, this PR refactors scratch space handling in ecmult_impl to improve code quality, tests, and documentation.

The biggest part of this PR is to make computing the scratch size and its inverse more precise by not assuming maximum padding when aligning, but rather using the actual padding. This is not strictly necessary but removes a leaky abstraction and makes testing easier.

real-or-random

(My review is best viewed commit by commit.)

real-or-random · 2019-06-13T12:44:24Z

src/tests.c

@@ -3094,7 +3094,7 @@ void test_ecmult_multi_batching(void) {
    secp256k1_scratch_destroy(&ctx->error_callback, scratch);

    for(i = 1; i <= n_points; i++) {
-        if (i > ECMULT_PIPPENGER_THRESHOLD) {
+        if (i >= ECMULT_PIPPENGER_THRESHOLD) {


real-or-random · 2019-06-13T12:44:35Z

src/ecmult_impl.h

-    state_space->wnaf_na = (int *) secp256k1_scratch_alloc(error_callback, scratch, entries*(WNAF_SIZE(bucket_window+1)) * sizeof(int));
-    buckets = (secp256k1_gej *) secp256k1_scratch_alloc(error_callback, scratch, (1<<bucket_window) * sizeof(*buckets));
+    state_space->wnaf_na = (int *) secp256k1_scratch_alloc(error_callback, scratch, entries * WNAF_SIZE(bucket_window+1) * sizeof(int));
+    buckets = (secp256k1_gej *) secp256k1_scratch_alloc(error_callback, scratch, sizeof(*buckets) << bucket_window);


real-or-random · 2019-06-13T12:44:46Z

src/util.h

@@ -93,7 +93,7 @@ static SECP256K1_INLINE void *checked_realloc(const secp256k1_callback* cb, void
 #define ALIGNMENT 16
 #endif

-#define ROUND_TO_ALIGN(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
+#define ROUND_TO_ALIGN(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
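A minimal sketch of why the extra parentheses in the macro matter. The `_OLD` and `_NEW` suffixes and the helper functions are hypothetical names used only for this illustration, and `ALIGNMENT` is fixed at 16. The argument is a conditional expression, which binds more loosely than `+`, so the unparenthesized form groups differently.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 16

/* Pre-fix form: the macro argument is pasted in without parentheses. */
#define ROUND_TO_ALIGN_OLD(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
/* Post-fix form: the argument is parenthesized before it is used. */
#define ROUND_TO_ALIGN_NEW(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)

/* Returns 1 if both macros agree when the argument is a plain variable. */
int macros_agree_on_variable(size_t n) {
    return ROUND_TO_ALIGN_OLD(n) == ROUND_TO_ALIGN_NEW(n);
}

/* Old form expands to (flag ? 1 : 2 + ALIGNMENT - 1), so the rounding
 * term is only added in the else branch. */
size_t old_with_conditional(int flag) {
    return (size_t)ROUND_TO_ALIGN_OLD(flag ? 1 : 2);
}

/* New form expands to ((flag ? 1 : 2) + ALIGNMENT - 1), as intended. */
size_t new_with_conditional(int flag) {
    return (size_t)ROUND_TO_ALIGN_NEW(flag ? 1 : 2);
}
```

For a plain variable both forms agree, which is presumably why the old macro worked for the existing call sites. With `flag` set, the old form rounds 1 down to 0 while the new form rounds it up to 16.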


real-or-random · 2019-06-13T12:49:23Z

src/ecmult_impl.h

+        return secp256k1_pippenger_scratch_size(n_points, bucket_window);
+    } else {
+        return secp256k1_strauss_scratch_size(n_points);
+    }


Approach ACK

What's an approach ACK?

Oh, I guess it's a concept ack. I was confused by other meanings of approach :D

I'm actively testing bitcoin/bitcoin#16149 here

edit: except that I just write "ACK". All my "ACK"s in this review mean "ACK thorough code inspection"


real-or-random · 2019-06-13T12:55:11Z

src/ecmult_impl.h

+    (*state_space)->wnaf_na = (int *) secp256k1_scratch_alloc(error_callback, scratch, entries * WNAF_SIZE(bucket_window+1) * sizeof(int));
+    *buckets = (secp256k1_gej *) secp256k1_scratch_alloc(error_callback, scratch, sizeof(secp256k1_gej) << bucket_window);
+    if ((*state_space)->ps == NULL || (*state_space)->wnaf_na == NULL || *buckets == NULL) {
+        secp256k1_scratch_apply_checkpoint(error_callback, scratch, scratch_checkpoint);


Approach ACK
I think it's cleaner to apply the checkpoint outside this function.

real-or-random · 2019-06-13T13:03:43Z

src/ecmult_impl.h

 #endif
-    return n_points*point_size;
+    size += ROUND_TO_ALIGN(n_points * sizeof(struct secp256k1_strauss_point_state));
+    return size;


Hm, I'm somewhat unsure about this. It seems like a layer violation to care about the alignment here.

If we want to have a function that returns the required scratch space given a number of points (and we should) then we need to add the padding somewhere. While we can assume the worst case padding somewhere else I would prefer to have the *_scratch_space function return exact results. This makes it much easier to think about and also helps testing because now we can just check that what is allocated actually matches what we computed with *_scratch_space (see 24553bf#diff-4655d106bf03045a3a50beefc800db21R2996). Or do you have an alternative in mind?

Added function alloc_size to scratch space
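The testing argument above can be sketched with a toy model: if the size function applies the same rounding as the allocator, the predicted size can be compared for equality with what was actually consumed. The `toy_*` names and the example sizes are hypothetical, and the model ignores details of the real scratch space such as the alignment of the base pointer.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 16
#define ROUND_TO_ALIGN(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)

/* Hypothetical stand-in for a scratch space: only tracks bytes used. */
typedef struct {
    size_t max_size;
    size_t alloc_size;
} toy_scratch;

/* Made-up allocation sizes used for the example. */
const size_t example_sizes[3] = {40, 100, 7};

/* Reserve `size` bytes, rounded up to ALIGNMENT. Returns 1 on success. */
int toy_scratch_alloc(toy_scratch *s, size_t size) {
    size_t rounded = ROUND_TO_ALIGN(size);
    if (rounded > s->max_size - s->alloc_size) {
        return 0;
    }
    s->alloc_size += rounded;
    return 1;
}

/* Exact number of bytes consumed by allocating sizes[0..n_sizes). */
size_t toy_alloc_size(const size_t *sizes, size_t n_sizes) {
    size_t sum = 0;
    size_t i;
    for (i = 0; i < n_sizes; i++) {
        sum += ROUND_TO_ALIGN(sizes[i]);
    }
    return sum;
}

/* Creates a scratch of exactly the predicted size, performs the
 * allocations, and returns 1 if the scratch ends up exactly full. */
int prediction_matches(const size_t *sizes, size_t n_sizes) {
    toy_scratch s;
    size_t i;
    s.max_size = toy_alloc_size(sizes, n_sizes);
    s.alloc_size = 0;
    for (i = 0; i < n_sizes; i++) {
        if (!toy_scratch_alloc(&s, sizes[i])) {
            return 0;
        }
    }
    return s.alloc_size == s.max_size;
}
```

A worst-case estimate that charges `ALIGNMENT - 1` extra bytes per allocation would give 192 bytes for these sizes instead of 176, so a test could only check an inequality.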

real-or-random · 2019-06-13T13:09:07Z

src/ecmult_impl.h

-    return secp256k1_scratch_max_allocation(error_callback, scratch, STRAUSS_SCRATCH_OBJECTS) / secp256k1_strauss_scratch_size(1);
+    /* Call max_allocation with 0 objects because we've already accounted for
+     * alignment in strauss_scratch_size. */
+    return secp256k1_scratch_max_allocation(error_callback, scratch, 0) / secp256k1_strauss_scratch_size(1);


... and wasn't the previous version (without changes in strauss_scratch_size) more precise?
We need to round up to the alignment only once per array (e.g., once for the scalars array).

In the proposed revision, I think we overestimate the required padding a lot because we still call strauss_scratch_size(1) here but this has padding now.

This is correct - sorry I overlooked this. Will add fix and test.
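A toy model of the issue raised here, with made-up element sizes rather than the real struct sizes: once the single-point size includes padding, dividing by it charges that padding once per point instead of once per array. All `toy_*` names and the `ELEM_*` constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 16
#define ROUND_TO_ALIGN(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)

/* Hypothetical per-point element sizes of two arrays (not the real structs). */
#define ELEM_A 4
#define ELEM_B 20

/* Exact bytes needed for n points: one padded allocation per array. */
size_t toy_scratch_size(size_t n) {
    return ROUND_TO_ALIGN(n * ELEM_A) + ROUND_TO_ALIGN(n * ELEM_B);
}

/* Divides by the padded size of a single point, so the padding of one
 * point is charged for every point. */
size_t toy_max_points_naive(size_t max_alloc) {
    return max_alloc / toy_scratch_size(1);
}

/* Reference answer by linear search: largest n whose exact size fits. */
size_t toy_max_points_exact(size_t max_alloc) {
    size_t n = 0;
    while (toy_scratch_size(n + 1) <= max_alloc) {
        n++;
    }
    return n;
}
```

In this model a single point costs 48 padded bytes but only 24 unpadded bytes, so for 4800 bytes the naive division yields 100 points while 200 actually fit.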

real-or-random · 2019-06-13T16:25:31Z

src/ecmult_impl.h

+             * account it suffices to decrease n_points by one. This is because
+             * the maximum padding required is less than an entry. */
+            n_points -= 1;
+            VERIFY_CHECK(space_for_points >= secp256k1_pippenger_scratch_size_points(n_points, bucket_window, 1));


After some discussion with @jonasnick, one of the things I'm not sure about is the added complexity in this function.
On the one hand, this function is accurate now and users of the function can rely on that.
On the other hand, if we just call secp256k1_scratch_max_allocation with PIPPENGER_SCRATCH_OBJECTS instead of 0, we may return a value that is one too small. That's not terrible for performance but it potentially makes the function a little bit harder to use and test because you may need to remember that it is not accurate.

It seems like both hands are arguments in favor of the change (i.e. calling secp256k1_scratch_max_allocation with 0).

We need to do that for strauss anyway because otherwise

n_points == strauss_max_points(..., scratch_create(strauss_scratch_size(n_points)))

wouldn't hold.
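The round-trip property can be sketched on a toy model. It borrows the decrement-by-one idea from the code comment in the diff, assuming the total padding is smaller than one unpadded point. The `toy_*` names and element sizes are hypothetical and do not reflect the real Strauss or Pippenger layouts.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 16
#define ROUND_TO_ALIGN(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)

/* Hypothetical element sizes of the two per-point arrays. */
#define ELEM_A 20
#define ELEM_B 36

/* Exact scratch size for n points, padding included. */
size_t toy_scratch_size(size_t n) {
    return ROUND_TO_ALIGN(n * ELEM_A) + ROUND_TO_ALIGN(n * ELEM_B);
}

/* Inverse of toy_scratch_size: the largest n that fits into max_alloc.
 * The total padding is at most 2 * (ALIGNMENT - 1) = 30 bytes, which is
 * less than one unpadded point (56 bytes), so at most one point is lost. */
size_t toy_max_points(size_t max_alloc) {
    size_t n = max_alloc / (ELEM_A + ELEM_B);
    if (toy_scratch_size(n) > max_alloc) {
        n -= 1;
    }
    return n;
}

/* Returns 1 if n == toy_max_points(toy_scratch_size(n)) for all n < limit. */
int round_trip_holds(size_t limit) {
    size_t n;
    for (n = 0; n < limit; n++) {
        if (toy_max_points(toy_scratch_size(n)) != n) {
            return 0;
        }
    }
    return 1;
}
```

With one byte less than the exact size for three points, the model falls back to two points, which is the case the decrement handles.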

jonasnick · 2019-06-16T19:49:22Z

I made a couple of changes and, to avoid adding code that is deleted in later commits, I force-pushed, sorry. Summary of the changes:

- added function alloc_size to scratch space to compute the actual size allocated for a given number of objects
- fixed a bug in strauss_max_points (thanks @real-or-random) that vastly underestimated the number of points actually fitting into the scratch space. Also added a test which would have caught this issue.
- added a verify check to ensure that the space required for a single point/entry is larger than the worst case padding

sipa · 2019-07-23T19:48:55Z

src/ecmult_impl.h


-        n_points = space_for_points/entry_size;
+        n_points = (space_for_points - entry_size)/entry_size;


Is this right? It's equivalent to space_for_points / entry_size - 1.

It is right. Simplified the line according to your suggestion.
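A small sketch of the equivalence discussed above. The two function names are hypothetical. Both forms use unsigned arithmetic, so they only agree when `space_for_points >= entry_size`; below that, each wraps around in its own way.

```c
#include <assert.h>
#include <stddef.h>

/* Form from the diff. */
size_t points_form_a(size_t space_for_points, size_t entry_size) {
    return (space_for_points - entry_size) / entry_size;
}

/* Simplified form suggested in the review. */
size_t points_form_b(size_t space_for_points, size_t entry_size) {
    return space_for_points / entry_size - 1;
}

/* Returns 1 if both forms agree for every space in [entry_size, limit]. */
int forms_agree(size_t entry_size, size_t limit) {
    size_t space;
    for (space = entry_size; space <= limit; space++) {
        if (points_form_a(space, entry_size) != points_form_b(space, entry_size)) {
            return 0;
        }
    }
    return 1;
}
```

Either form therefore relies on an earlier check that at least one entry fits.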


sipa · 2019-07-23T19:55:25Z

Concept ACK, I still need to go over the logic changes.

real-or-random · 2021-04-07T12:18:55Z

needs rebase

jonasnick · 2021-11-06T20:48:29Z

Rebased and polished quite a bit. Also added fix for bug in master that we noticed before iirc. So to make sure it gets in I opened #1004.

Still, I didn't fully try to understand how this PR works. Also, it seems like ecmult_multi_scratch_size doesn't give the exact optimal result. That's because, with a scratch space of size pippenger_scratch_size(n_points, bucket_window), it may happen that strauss_max_points(error_callback, scratch), n) (the actual batch size) is smaller than n_points.

robot-dreams

Concept ACK

robot-dreams · 2021-11-29T01:40:42Z

src/scratch_impl.h

+    size_t sum = 0;
+
+    for (i = 0; i < n_sizes; i++) {
+        sum += ROUND_TO_ALIGN(sizes[i]);


Nit: The existing secp256k1_scratch_max_allocation seems very careful about checking for overflow. For consistency, is it necessary to do the same here? For example:

    // Check for overflow
    if (sum + ROUND_TO_ALIGN(sizes[i]) < sum) {
        return 0;
    }

I think I fixed this. Also added test.
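One way such a check could look, as a sketch only and not the code that was actually pushed. The function name and example arrays are hypothetical. It follows the reviewer's suggestion of returning 0 on overflow, and it additionally guards the rounding step, which can itself wrap for sizes close to `SIZE_MAX`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGNMENT 16

/* Made-up inputs: one harmless, one whose aligned sum exceeds SIZE_MAX. */
const size_t small_sizes[3] = {40, 100, 7};
const size_t huge_sizes[2] = {SIZE_MAX - 31, 32};

/* Sums the aligned sizes. Returns 0 if rounding a size or adding it to
 * the running sum would exceed SIZE_MAX. */
size_t checked_alloc_size(const size_t *sizes, size_t n_sizes) {
    size_t sum = 0;
    size_t i;
    for (i = 0; i < n_sizes; i++) {
        size_t rounded;
        /* Rounding up can wrap for sizes close to SIZE_MAX. */
        if (sizes[i] > SIZE_MAX - (ALIGNMENT - 1)) {
            return 0;
        }
        rounded = ((sizes[i] + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
        /* Compare before adding so that the sum never wraps. */
        if (rounded > SIZE_MAX - sum) {
            return 0;
        }
        sum += rounded;
    }
    return sum;
}
```

Comparing against `SIZE_MAX - sum` before the addition is equivalent to the `sum + x < sum` test for unsigned types, but avoids relying on the wrapped value.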

src/ecmult_impl.h

robot-dreams · 2021-11-29T02:39:33Z

src/ecmult_impl.h


-        n_points = space_for_points/entry_size;
+        n_points = (space_for_points - entry_size)/entry_size;


Is this assignment to n_points (along with the comment) redundant with above?

robot-dreams · 2021-11-29T02:43:25Z

src/ecmult_impl.h

- * Returns the maximum number of points in addition to G that can be used with
- * a given scratch space. The function ensures that fewer points may also be
- * used.
+/* Returns the (near) maximum number of points in addition to G that can be


~~Do you already know how it might fail to be a maximum? (No worries if not, I still want to revisit these details carefully.)~~

Edit: Could this fail to be a maximum because the constant space used by the buckets decreases when you jump to the next bucket window size?
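A purely hypothetical model of how a size function can become non-monotonic when the window depends on the number of points. All constants and the window schedule are invented for illustration and are not taken from secp256k1. The only assumption carried over from the diffs is the shape: a constant part that grows with the bucket window plus a per-point part that may shrink with it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical window schedule: small batches use window 1, larger use 2. */
int toy_bucket_window(size_t n_points) {
    return n_points <= 4 ? 1 : 2;
}

/* Hypothetical constant part: bucket storage doubles with each window bit. */
size_t toy_size_constant(int bucket_window) {
    return (size_t)64 << bucket_window;
}

/* Hypothetical per-point part: shrinks as the window grows. */
size_t toy_entry_size(int bucket_window) {
    return bucket_window == 1 ? 200 : 100;
}

/* Total size when the window is chosen from n_points. */
size_t toy_scratch_size(size_t n_points) {
    int w = toy_bucket_window(n_points);
    return toy_size_constant(w) + n_points * toy_entry_size(w);
}
```

In this model 4 points need 928 bytes but 5 points need only 756. A search that stops at the first n that does not fit would return 3 for a 900-byte scratch space, although 6 points (856 bytes) also fit. Whether this is what happens in the real function is not established here.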

robot-dreams

Looks good overall.

It's unfortunate that:

- Padding / alignment adds so much complexity to these calculations
- Pippenger sizes have this weird non-monotonic behavior

But I still think your change makes sense.

My only general feedback is that updating the scratch space usage would involve keeping a lot of different things in sync, similar to what @real-or-random mentioned at #1004 (comment). Is there a way to refactor or add comments to make the task easier in the future (e.g. by sharing code between scratch_size_raw and batch_allocate)?

robot-dreams · 2021-11-29T23:12:25Z

src/ecmult_impl.h

 }

 static int secp256k1_ecmult_pippenger_batch(const secp256k1_callback* error_callback, secp256k1_scratch *scratch, secp256k1_gej *r, const secp256k1_scalar *inp_g_sc, secp256k1_ecmult_multi_callback cb, void *cbdata, size_t n_points, size_t cb_offset) {
    const size_t scratch_checkpoint = secp256k1_scratch_checkpoint(error_callback, scratch);
    /* Use 2(n+1) with the endomorphism, when calculating batch
     * sizes. The reason for +1 is that we add the G scalar to the list of
     * other scalars. */
-    size_t entries = 2*n_points + 2;
+    size_t entries = secp256k1_pippenger_entries(n_points);


Nit: Should the comment go above the definition of the function secp256k1_pippenger_entries instead?

robot-dreams · 2021-11-29T23:20:20Z

src/ecmult_impl.h

-        space_overhead = (sizeof(secp256k1_gej) << bucket_window) + entry_size + sizeof(struct secp256k1_pippenger_state);
-        if (space_overhead > max_alloc) {
+        space_constant = secp256k1_pippenger_scratch_size_constant(bucket_window);
+        if (space_constant + entry_size > max_alloc) {


Style nit (feel free to ignore): Was this check previously here for avoiding underflow (rather than short-circuiting)? If so would it make sense to keep the check as space_constant > max_alloc to make the intent clear?
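The two roles of the check can be separated in a small sketch. The function names are hypothetical and the real code does more than this. The `space_constant > max_alloc` part is what keeps the unsigned subtraction from wrapping, while the `+ entry_size` part only returns early when the result would be 0 anyway.

```c
#include <assert.h>
#include <stddef.h>

/* Number of entries that fit after the constant overhead is paid.
 * The guard rejects inputs for which max_alloc - space_constant would
 * wrap around, and also returns early when not even one entry fits. */
size_t toy_points_guarded(size_t max_alloc, size_t space_constant, size_t entry_size) {
    if (space_constant + entry_size > max_alloc) {
        return 0;
    }
    return (max_alloc - space_constant) / entry_size;
}

/* Same computation without the guard, shown only for comparison. */
size_t toy_points_unguarded(size_t max_alloc, size_t space_constant, size_t entry_size) {
    return (max_alloc - space_constant) / entry_size;
}
```

When `space_constant <= max_alloc` but less than one entry fits, both versions return 0, so only the underflow case distinguishes them. The sketch does not handle an overflow of `space_constant + entry_size` itself.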

robot-dreams · 2021-11-30T00:05:14Z

src/tests.c

+void test_ecmult_multi_strauss_max_points(void) {
+    size_t scratch_size = secp256k1_strauss_scratch_size_raw(1, 0);
+    size_t max_scratch_size = secp256k1_strauss_scratch_size_raw(1, 1) + 1;
+    for (; scratch_size < max_scratch_size; scratch_size++) {


Would it make sense to check bigger scratch_size (so that n_points is bigger too), but increase the amount scratch_size is incremented on each iteration?

robot-dreams · 2021-11-30T00:06:52Z

src/tests.c

+        secp256k1_scratch *scratch = secp256k1_scratch_create(&ctx->error_callback, scratch_size);
+        size_t n_points = secp256k1_strauss_max_points(&ctx->error_callback, scratch);
+        CHECK(secp256k1_scratch_max_allocation(&ctx->error_callback, scratch, 0) == scratch_size);
+        CHECK(scratch_size >= secp256k1_strauss_scratch_size(n_points));


Would it make sense to also check that the result is exact, e.g. by adding a check like this:

CHECK(scratch_size < secp256k1_strauss_scratch_size(n_points + 1));
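A sketch of how the two suggestions (larger scratch sizes with a coarser stride, plus an upper-bound check) might be combined, written against a toy model rather than the real functions. `POINT_SIZE`, the `toy_*` functions and `sweep_is_exact` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 16
#define ROUND_TO_ALIGN(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
#define POINT_SIZE 56 /* hypothetical unpadded bytes per point */

/* Exact size for n points stored in a single padded array. */
size_t toy_scratch_size(size_t n) {
    return ROUND_TO_ALIGN(n * POINT_SIZE);
}

/* Largest n that fits; padding (< 16) is smaller than one point (56). */
size_t toy_max_points(size_t max_alloc) {
    size_t n = max_alloc / POINT_SIZE;
    if (toy_scratch_size(n) > max_alloc) {
        n -= 1;
    }
    return n;
}

/* Sweeps scratch sizes in steps of `stride` (must be nonzero). Returns 1
 * if every result both fits and cannot be increased by one point. */
int sweep_is_exact(size_t max_scratch_size, size_t stride) {
    size_t scratch_size;
    for (scratch_size = 0; scratch_size <= max_scratch_size; scratch_size += stride) {
        size_t n = toy_max_points(scratch_size);
        if (scratch_size < toy_scratch_size(n)) {
            return 0; /* n points do not fit */
        }
        if (scratch_size >= toy_scratch_size(n + 1)) {
            return 0; /* n + 1 points would also fit */
        }
    }
    return 1;
}
```

A stride that is coprime to the alignment, such as 7, still visits every residue modulo 16 over a long enough sweep.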

robot-dreams · 2021-11-30T00:12:30Z

src/ecmult_impl.h

+        return secp256k1_pippenger_scratch_size(n_points, bucket_window);
+    } else {
+        return secp256k1_strauss_scratch_size(n_points);
+    }


(No action needed) Responding to this 2 year old comment 😅

In Core, it looks like "Approach ACK" means "Concept ACK, and I agree with the approach of this change (but I haven't reviewed the code in detail)":

https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#conceptual-review

robot-dreams · 2021-11-30T01:18:45Z

src/ecmult_impl.h

-        size_t entry_size = sizeof(secp256k1_ge) + sizeof(secp256k1_scalar) + sizeof(struct secp256k1_pippenger_point_state) + (WNAF_SIZE(bucket_window+1)+1)*sizeof(int);
+        size_t space_constant;
+        /* Compute entry size without taking alignment into account */
+        size_t entry_size = secp256k1_pippenger_scratch_size_points(0, bucket_window, 0);


Style nit (feel free to ignore): Should this be called point_size instead to avoid confusion, since in other places you get 2(n+1) entries from the endomorphism?

Take actual alignment into account instead of assuming the worst case. This improves the tests because it can be checked that *_scratch_size matches what is actually allocated.

Take actual alignment into account instead of assuming the worst case. This allows more precise tests for strauss, because if a scratch space has exactly strauss_scratch_size(n_points) left, then secp256k1_strauss_max_points(cb, scratch) = n_points.

jonasnick · 2022-01-30T17:11:06Z

I rebased this to see how master affects this PR. Will still need to address review comments and add better explanations to the commits.

jonasnick force-pushed the ecmult-scratch branch from 36f2e13 to e774a6e Compare June 13, 2019 07:10

real-or-random reviewed Jun 13, 2019


jonasnick force-pushed the ecmult-scratch branch 2 times, most recently from 872a17b to cbe3cc7 Compare June 16, 2019 19:45

jonasnick force-pushed the ecmult-scratch branch from cbe3cc7 to 83fb087 Compare June 16, 2019 19:55

sipa reviewed Jul 23, 2019


jonasnick mentioned this pull request Oct 25, 2021

Replace MuSig(1) module with MuSig2 BlockstreamResearch/secp256k1-zkp#131

Merged

8 tasks

jonasnick force-pushed the ecmult-scratch branch from 85805b1 to 8e68782 Compare November 6, 2021 20:41

jonasnick force-pushed the ecmult-scratch branch from 8e68782 to e54c1af Compare November 6, 2021 20:58

robot-dreams reviewed Nov 29, 2021


robot-dreams reviewed Nov 30, 2021


jonasnick added 6 commits January 28, 2022 23:04

ecmult: fix off-by-one in ecmult_multi test

8040c1f

ecmult: compute allocated size for pippenger buckets consistently

4c5ac49

scratch: add alloc_size function to scratch_space

8a29272

ecmult: make strauss_ and pippenger_scratch_size more precise

4d78e9b

Take actual alignment into account instead of assuming the worst case. This improves the tests because it can be checked that *_scratch_size matches what is actually allocated.

ecmult: add function to compute optimal scratch space size

0d574b4

jonasnick force-pushed the ecmult-scratch branch from e54c1af to 0d574b4 Compare January 30, 2022 17:10


