-
-
Notifications
You must be signed in to change notification settings - Fork 771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement O(n) sort for inverted selection #1904
Conversation
@KlzXS can you please review this? |
@jarkkojs CI fails with the following:
|
I'll check what is going on, will return soon'ish. |
QuickSort produces O(nlog n) at best and O(n^2) at worst. Sort selection markers with a trivial radix sort algorithm, which guarantees O(n) time. As a consequence both time and space consumption will be linear. Do not generalize the algorithm, as other sites most likely require a comparison based sorting algorithm. For those sites the goal should be to eliminate the worst case O(n^2) and have a steady O(nlog n) performance. Signed-off-by: Jarkko Sakkinen <[email protected]>
I also wrote a better commit message:
E.g. heap sort is much better option for comparison cases... These set actual guarantees for order of the function (in both time and space dimensions). And further, they are somewhat de-facto choices for e.g. embedded system with low specs, which I guess fit tho this project quite well. |
We use the libc I'll look into the patch. Though I'm skeptical of whether the performance improvement (if any) is actually worthwhile or not. So far I've never run into a situation where sorting was the bottleneck. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are mostly code-style related change requests. The core logic seems good.
I'll test it out and try to benchmark it, but as @N-R-K said I'm not expecting a real performance boost.
selmark *marked = malloc(nselected * sizeof(selmark)); | ||
selmark *marked = calloc(nselected, sizeof(*marked)); | ||
selmark *scratch = calloc(nselected, sizeof(*scratch)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we replacing the actual struct selmark
with this dereferencing stuff? Is this even well-formed C? Initializing a pointer with a value that dereferences said pointer seems like an obvious no-no to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I just realized that it's changed to calloc()
. Why? I don't think we need that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this even well-formed C? Initializing a pointer with a value that dereferences said pointer seems like an obvious no-no to me.
Yeah, that's fine. sizeof
only needs to know the type of the operand in order to derive it's size. It's guaranteed not to evaluate anything (with the exception when the operand is a VLA, one more reason I hate VLAs).
For reference C11 6.5.3.4:
If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more you know. I'm still more for sizeof(selmark)
, using an identifier during its initialization just looks off to me. Up to you @jarkkojs.
calloc()
still unnecessary and I'd still move scrath
into the sorting function.
void *tmp; | ||
int b; | ||
|
||
for (b = 0; b < 8; b++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic 8 being?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had intended to comment on this. I think it's assuming sizeof(size_t) == 8
, but that's not always true, nnn
should work fine on 32bit systems too.
EDIT: just noticed the comment down below, looks like you figured it out :D
|
||
for (i = 0; i < nelem; i++) { | ||
value = (size_t)data[i].startpos; | ||
j = idx[(value >> (b * 8)) & 0xFF]++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only here do we learn what b
actually is and where the magic 8 comes from. It's the index of the byte in value
.
But this also assumes that size_t
is actually 64-bit i.e. 8 bytes long. I think we support 32-bit and systems where size_t
may not be 8 bytes.
Adjust the names and the limit for b
to depend on the size of size_t
.
size_t cnt[256]; | ||
size_t idx[256]; | ||
size_t value; | ||
size_t i, j; | ||
void *tmp; | ||
int b; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer these names to be a bit more descriptive. What is b
? cnt
and idx
make sense if you know what radix sort does, so that's more or less fine. i
is an iteration variable that's fine, but j
isn't, it's used for something completely unrelated.
Also small nitpick, we know the type of tmp
, it should be selmark *
just as data
and scratch
are.
@@ -1850,6 +1881,7 @@ static void invertselbuf(const int pathlen) | |||
} | |||
|
|||
free(marked); | |||
free(scratch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should scratch
really be allocated here? It's only used in sort_marked()
and I see no benefit in allocating it in this function. Might as well just allocate and free it there.
|
||
return ma->startpos - mb->startpos; | ||
idx[0] = 0; | ||
for (i = 1; i < 256; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We mostly use ++i
. Let's keep it consistent.
@KlzXS do you really see any merit in this change? I would prefer to stick to the library function. |
There is some merit to it, yes. How much exactly is debatable. On a modern system? Probably little to none. Unless you get On something like a router or another embedded low power device? It would be better due to the raw difference in processing power. A 100MHz CPU would be happier with radix sort. Though the searching overhead is still there. |
@jarkkojs can you please share some performance numbers (along with the processor details, if possible on a regular smartphone)? |
If you want to reduce the searching overhead, we could put all the filenames in a hash table and then in one pass through Again, I'm not sure how impactful that would be or if we should even bother improving this function at all other than for the sake of bragging rights for it being "optimal". I'll leave that for you to consider. |
OK, I'll close this. |
No description provided.