Set optimization from -O2 to -O3, 3.4% performance increase observed #278
Raising the gcc optimization level from -O2 to -O3 reduced the runtime by 3.4% on average over three runs of one example.
bwa uses lots of bit shift operators. It seems -O3 makes a difference here; see this trivial example:
https://www.godbolt.org/z/jD_cC5
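For readers who cannot open the link, here is a minimal sketch of the kind of shift-heavy loop meant here. It is not the code behind the godbolt link and not taken from bwa: `get_base` and `count_bases` are hypothetical helpers that read 2-bit base codes from a packed byte array. -O3 enables extra loop optimizations on top of -O2 (for example `-fpeel-loops` and `-funswitch-loops`), but whether they change this particular loop depends on the gcc version and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from bwa): return the i-th 2-bit base code
 * from a packed array holding 4 bases per byte, first base in the
 * two most significant bits. */
static inline uint8_t get_base(const uint8_t *packed, size_t i)
{
    /* i >> 2 selects the byte; (~i & 3) << 1 gives a shift of 6, 4, 2 or 0 */
    return (uint8_t)((packed[i >> 2] >> ((~i & 3) << 1)) & 3);
}

/* Hypothetical helper: count how often each of the 4 base codes occurs
 * among the first n bases. */
void count_bases(const uint8_t *packed, size_t n, uint64_t counts[4])
{
    for (int k = 0; k < 4; k++)
        counts[k] = 0;
    for (size_t i = 0; i < n; i++)
        counts[get_base(packed, i)]++;
}
```

Saved as, say, `shift.c` (an assumed file name), the generated assembly can be compared with `gcc -O2 -S shift.c -o shift_O2.s` and `gcc -O3 -S shift.c -o shift_O3.s`.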
I aligned one million reads to mm10 using the command line:
bwa aln -t 4 -f reads.sai mm10.fa Andreas_BWA/SRR1519948.1_1000000.fastq
Time went down from 62.8s to 60.6s.
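As a quick sanity check, the relative decrease implied by these two rounded timings works out to:

```latex
\frac{62.8\,\mathrm{s} - 60.6\,\mathrm{s}}{62.8\,\mathrm{s}} = \frac{2.2}{62.8} \approx 0.035
```

That is about 3.5%, close to the 3.4% three-run average. The small gap is plausibly due to rounding the timings to 0.1s, which is an assumption on my part and not something the measurements show.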