FlatGFA: Hand-rolled GFA parser #154

sampsyo · 2024-03-17T20:33:50Z

The next bottleneck in GFA parsing was the external rs-gfa library, so I replaced it with a hand-rolled parser.
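To illustrate the general idea (this is a minimal sketch, not the code in this PR), a hand-rolled parser for a GFA segment line can work directly on bytes and borrow the sequence from the input buffer instead of allocating strings. The names `Segment`, `parse_num`, and `parse_segment` are hypothetical, and the sketch assumes segment names are decimal integers:

```rust
/// A parsed GFA segment line, borrowing the sequence from the input buffer.
/// (Hypothetical type for illustration; assumes numeric segment names.)
#[derive(Debug, PartialEq)]
struct Segment<'a> {
    name: usize,
    seq: &'a [u8],
}

/// Parse a decimal integer from the front of `buf`.
/// Returns the value and the unconsumed remainder, or `None` if there are
/// no leading digits or the value overflows `usize`.
fn parse_num(buf: &[u8]) -> Option<(usize, &[u8])> {
    let mut n: usize = 0;
    let mut i = 0;
    while i < buf.len() && buf[i].is_ascii_digit() {
        n = n.checked_mul(10)?.checked_add((buf[i] - b'0') as usize)?;
        i += 1;
    }
    if i == 0 {
        None
    } else {
        Some((n, &buf[i..]))
    }
}

/// Parse one `S` line: `S<TAB>name<TAB>sequence[<TAB>optional fields]`.
/// Optional fields after the sequence are ignored in this sketch.
fn parse_segment(line: &[u8]) -> Option<Segment<'_>> {
    let rest = line.strip_prefix(b"S\t")?;
    let (name, rest) = parse_num(rest)?;
    let rest = rest.strip_prefix(b"\t")?;
    // The sequence runs up to the next tab, or to the end of the line.
    let end = rest.iter().position(|&b| b == b'\t').unwrap_or(rest.len());
    Some(Segment { name, seq: &rest[..end] })
}
```

Calling `parse_segment(b"S\t42\tACGT")` yields a `Segment` whose `seq` points into the original buffer, so no copy is made until the data is written out.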

Using the same measurement setup as #153:

So that's another 1.9x and 1.5x speedup over the last set of optimizations, for a total of 4x speedup over the first version.

It's clear that the bottleneck now is in the memcpying to the destination files, which is also avoidable (with some compromises).

sampsyo added 17 commits March 17, 2024 12:31

Tiny start for hand-rolled parser

bfa1da5

Hand-roll segment parser

18f37ae

Hand-rolled parser for links

b98c4f8

Thanks, Clippy

410d8a5

Hand-roll a path parser

f33d7f1

Relocate the steps parser

ec8c178

Drop old parser

fec913d

Li'l refactor

b456409

Simplify path deferral

cc2462b

Even "streamier" path parsing

cd96342

Never create strings

12b7105

Unsafe number parsing!!!

9a7529c

Maybe-faster number parsing

8b62c4c

Revert "Maybe-faster number parsing"

7806705

This reverts commit 8b62c4c. Somehow, this was actually slower (and added an `unsafe`)!

Comments and types

37795d9

Hand-roll CIGAR parser

d588d5b

This is the last piece!

Remove dependency on parser library

d2eb84c
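For readers unfamiliar with the last piece in the commit list above: the overlap field on GFA link lines is a CIGAR string such as `4M2I`. A rough sketch of a hand-rolled CIGAR parser follows. It is not the code from d588d5b; `CigarOp` and `parse_cigar` are hypothetical names, and the `*` placeholder that GFA allows for an unspecified overlap is not handled:

```rust
/// One CIGAR operation: a run length and an operation byte such as b'M'.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
struct CigarOp {
    len: u32,
    op: u8,
}

/// Parse a CIGAR string like `4M2I` into a list of operations.
/// Returns `None` on an unknown operation byte, an operation with no
/// preceding length, trailing digits, or a length that overflows `u32`.
fn parse_cigar(s: &[u8]) -> Option<Vec<CigarOp>> {
    let mut ops = Vec::new();
    let mut len: u32 = 0;
    let mut have_digits = false;
    for &b in s {
        if b.is_ascii_digit() {
            // Accumulate the run length one digit at a time.
            len = len.checked_mul(10)?.checked_add(u32::from(b - b'0'))?;
            have_digits = true;
        } else if have_digits && b"MIDNSHPX=".contains(&b) {
            ops.push(CigarOp { len, op: b });
            len = 0;
            have_digits = false;
        } else {
            return None;
        }
    }
    // Digits left over without an operation byte are an error.
    if have_digits {
        None
    } else {
        Some(ops)
    }
}
```

An empty input produces an empty list here; whether that should be an error instead is a design choice this sketch leaves open.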

sampsyo merged commit 031f235 into main Mar 17, 2024
3 checks passed

sampsyo deleted the polbin-hand-parse branch March 17, 2024 21:50

sampsyo mentioned this pull request Mar 19, 2024

Use the memchr and atoi crates #157

Merged
