Improve dudect test with cropped time analysis #283

BennyWang1007 · 2025-03-22T17:45:54Z

The original dudect collects all execution times and performs t-tests on them, so the results may be affected by outliers. Outliers can be caused by context switches, interrupts, or other system activity. This patch introduces percentile-based cropping to remove them.

The patch adds a new function "prepare_percentiles()" to compute thresholds using a complementary exponential decay scale. The function is called before the test starts.

The patch modifies "update_statistics()" to perform t-tests on cropped execution times by filtering out the outliers.

Andrushika · 2025-03-23T04:56:22Z

This implementation seems quite different from the original one in dudect. In the original version, the timing data cropped at each percentile scale is kept as a separate data set (refer to the original update_statistics()). For example, with 100 different crop scales there are 101 arrays (100 cropped, plus 1 for the raw data that is not cropped), each storing data under a different scale, and the max t-value is extracted from these 101 arrays.
The implementation in this commit, however, adds all data into the "same array". It only weakens the influence of outliers, by adding them relatively few times, instead of excluding them.

BennyWang1007 · 2025-03-23T14:04:10Z

Thank you for your feedback! You’re right that the previous implementation did not fully align with the original approach in dudect. Instead of maintaining separate data sets for different percentile scales, it weakened the influence of outliers rather than excluding them.

I’ve now updated the code to correctly implement NUM_PERCENTILES + 1 independent tests, ensuring that each cropping scale maintains its own dataset. Please review the updated version, and let me know if you have any further suggestions. Thanks!

dudect/fixture.c

jserv · 2025-03-25T00:33:18Z

Consider the changes below:

--- a/dudect/fixture.c
+++ b/dudect/fixture.c
@@ -67,11 +67,11 @@ static int64_t percentile(const int64_t *a_sorted, double which, size_t size)
     return a_sorted[pos];
 }
 
-static int cmp(const int64_t *a, const int64_t *b)
+/* leverages the fact that comparison expressions return 1 or 0. */
+static int cmp(const void *aa, const void *bb)
 {
-    if (*a == *b)
-        return 0;
-    return (*a - *b) > 0 ? 1 : -1;
+    int64_t a = *(const int64_t *) aa, b = *(const int64_t *) bb;
+    return (a > b) - (a < b);
 }
 
 /* This function is used to set different thresholds for cropping measurements.
@@ -82,8 +82,7 @@ static int cmp(const int64_t *a, const int64_t *b)
  */
 static void prepare_percentiles(int64_t *exec_times, int64_t *percentiles)
 {
-    qsort(exec_times, N_MEASURES, sizeof(int64_t),
-          (int (*)(const void *, const void *)) cmp);
+    qsort(exec_times, N_MEASURES, sizeof(int64_t), cmp);
 
     for (size_t i = 0; i < NUM_PERCENTILES; i++) {
         percentiles[i] = percentile(

jserv

Ensure that Change-Ids appear in the commit messages.

BennyWang1007 · 2025-03-25T10:08:53Z

Your implementation is solid: it not only removes branching but also improves performance. I analyzed the machine code generated by common C compilers, and in most cases your approach compiles to more efficient code.

I have added your implementation in commit ba691d0. Thanks for the review!

jserv · 2025-03-25T12:05:33Z

I defer to @Andrushika for confirmation.

jserv · 2025-03-25T12:10:04Z

The purpose of the branchless comparison function was not to mitigate side-channel attacks, but to improve computational efficiency. See Branchless Programming.
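For readers following along, here is a small self-contained sketch of the comparator in isolation. Besides avoiding a branch, it never subtracts its operands, so it stays well defined where `*a - *b` would overflow (for example with INT64_MIN and INT64_MAX). The wrapper `sort_i64()` is a hypothetical helper added only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison without subtraction: each relational expression
 * yields 0 or 1, so the result is -1, 0, or 1 and cannot overflow.
 */
static int cmp_i64(const void *pa, const void *pb)
{
    int64_t a = *(const int64_t *) pa, b = *(const int64_t *) pb;
    return (a > b) - (a < b);
}

/* Hypothetical helper: sort an array of int64_t in ascending order. */
static void sort_i64(int64_t *v, size_t n)
{
    assert(v != NULL);
    qsort(v, n, sizeof(int64_t), cmp_i64);
}
```

Because the comparator already has the signature `qsort()` expects, no function-pointer cast is needed at the call site.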

Andrushika

There are a few minor parts that could be improved. Please take a look at the suggestions.

Andrushika · 2025-03-25T14:07:10Z

dudect/fixture.c

+    t_context_t *t = max_test();
    double max_t = fabs(t_compute(t));


Since the only use of *t is to calculate max_t, I recommend reducing the redundant code:

double max_t = max_test();

Also, remember to change the return value and types of max_test().

It is also used to calculate number_traces_max_t, which determines whether there are enough measurements.

Got it, that was my mistake. Sorry for the misunderstanding.

Andrushika · 2025-03-25T14:13:46Z

dudect/fixture.c

+    t_context_t *t = max_test();
    double max_t = fabs(t_compute(t));
    double number_traces_max_t = t->n[0] + t->n[1];
    double max_tau = max_t / sqrt(number_traces_max_t);


It is better to calculate max_t and max_tau after checking whether number_traces_max_t < ENOUGH_MEASURE, to avoid unnecessary calculation.

I can modify max_test to return number_traces_max_t and take a double *max_t argument through which the max_t value is retrieved, preventing unnecessary recalculations.
However, this may slightly reduce the readability. Do you think it's worth it?

/* This function returns the number of measurements of the test with the
 * maximum t value, and sets *max_t to that maximum t value.
 */
static double max_test_size(double *max_t)
{
    ...
}

static bool report(void)
{
    double max_t;
    double number_traces_max_t = max_test_size(&max_t);
    ...
}

Andrushika · 2025-03-27T13:36:54Z

@jserv I've reviewed the changes and everything looks good. It's ready for merge!

BennyWang1007 · 2025-03-27T22:43:00Z

Sorry for the extra commits. In two previous commits the commit hook failed to add the Change-Id properly, so running make test resulted in errors for all test cases. To resolve this issue, I am recommitting them with the correct Change-Id.

jserv

Squash git commits and refine commit messages.

This commit introduces several improvements to the dudect test, including cropped time analysis and performance optimizations.

- Remove outliers caused by context switches, interrupts, or system activity using a percentile-based threshold.
- Store measurements in multiple t-test contexts to track t-tests in different percentile thresholds.
- Fix integer overflow and improve the efficiency in the 'cmp()' function by using a branch-free comparison '(a > b) - (a < b)'.
- Optimize the calculation of 'max_t' and 'max_tau' by deferring computations until necessary, reducing unnecessary calculation when measurements are insufficient.

Change-Id: Icb910a8b9d93305da8e478e7e79cd25891c9e72e

BennyWang1007 · 2025-03-28T16:40:07Z

I've finished squashing the commits. Let me know if anything needs further refinement.

jserv · 2025-03-29T00:20:25Z

Instead of leaving messages, you can simply press the review button on GitHub next time.

See https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/requesting-a-pull-request-review

jserv · 2025-03-29T00:23:20Z

Thank @BennyWang1007 for contributing!

jserv reviewed Mar 24, 2025

jserv requested changes Mar 25, 2025

Andrushika suggested changes Mar 25, 2025

BennyWang1007 force-pushed the dudect_fix branch from 320a2b9 to 7e95512 Compare March 27, 2025 22:39

jserv requested changes Mar 28, 2025

BennyWang1007 force-pushed the dudect_fix branch 3 times, most recently from 760d361 to 0c8b186 Compare March 28, 2025 16:35

BennyWang1007 force-pushed the dudect_fix branch from 0c8b186 to 83077d6 Compare March 28, 2025 16:37

jserv merged commit f53314e into sysprog21:master Mar 29, 2025
1 of 2 checks passed
