Describe the bug
Hello everyone, I am currently trying to index large peptide FASTA files (~50 GB) for peptide searches. This FASTA file contains 85748938 entries of short peptides, all of them unique. I am using the BuildSA class and call it as follows:
java -Xmx256000M -cp <PATH>/MSGFPLUS_v20220418/MSGFPlus.jar edu.ucsd.msjava.msdbsearch.BuildSA -d peptides.fasta -tda 1 -decoy XXX
and get the following error from MSGF+:
Creating peptides.revCat.fasta.
Building suffix array: /mntc/<PATH>/work/f5/71e50c34429da341c0ad240e4f40ed/peptides.revCat.fasta
Exception in thread "main" java.lang.NegativeArraySizeException: -541141435
at edu.ucsd.msjava.msdbsearch.CompactFastaSequence.readSequence(CompactFastaSequence.java:542)
at edu.ucsd.msjava.msdbsearch.CompactFastaSequence.<init>(CompactFastaSequence.java:139)
at edu.ucsd.msjava.msdbsearch.CompactFastaSequence.<init>(CompactFastaSequence.java:89)
at edu.ucsd.msjava.msdbsearch.BuildSA.buildSAFiles(BuildSA.java:144)
at edu.ucsd.msjava.msdbsearch.BuildSA.buildSA(BuildSA.java:96)
at edu.ucsd.msjava.msdbsearch.BuildSA.main(BuildSA.java:56)
The stack trace points to an array allocation in CompactFastaSequence.readSequence (CompactFastaSequence.java:542).
I was wondering whether this error could be fixed quickly, since I would like to use MSGF+ for identification even with FASTA files as large as the ones I am using here. Maybe it is simply a matter of using long instead of int, because an overflow may be happening at this point. However, I cannot judge whether other places would need to be adjusted as well.
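To illustrate the suspected mechanism, here is a minimal standalone sketch that does not use any MSGF+ code. The class OverflowSketch and its methods are hypothetical names, and the value 3753825861 is only one of many long values that wrap to the number in the exception message; I do not know which size MSGF+ actually computes at that line.

```java
// Standalone sketch, not MSGF+ code: shows how narrowing a long size to int
// can produce a negative array length like the one in the stack trace.
class OverflowSketch {

    // Hypothetical helper: an unchecked narrowing cast, which silently wraps.
    static int narrow(long size) {
        return (int) size;
    }

    // Returns true if allocating a byte array of the narrowed size fails
    // with NegativeArraySizeException.
    static boolean allocationFails(long size) {
        try {
            byte[] buffer = new byte[narrow(size)];
            return buffer.length < 0; // never true; keeps the array in use
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Illustrative value only (assumption): larger than Integer.MAX_VALUE.
        long size = 3_753_825_861L;

        System.out.println("narrowed size: " + narrow(size));
        System.out.println("allocation fails: " + allocationFails(size));

        // Math.toIntExact reports the overflow instead of wrapping silently.
        try {
            Math.toIntExact(size);
        } catch (ArithmeticException e) {
            System.out.println("toIntExact: " + e.getMessage());
        }
    }
}
```

Note that Java arrays are indexed by int, so a single array cannot hold more than Integer.MAX_VALUE elements. If the overflow is indeed the cause, changing the variable type to long would probably not be enough on its own; the data might also have to be split across several arrays or stored differently.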