Tutorial sections and SQLite database for MONA Spectral Database (.msp) #27

Open
Tony-II opened this issue Oct 19, 2021 · 4 comments

Comments

@Tony-II

Tony-II commented Oct 19, 2021

Hello jorainer,

First of all, many thanks for this great and comprehensible tutorial, and of course for all the wonderful work hidden in the Spectra package.

I would like to ask some questions about it:

1.) Could the function compareSpectra() always return a matrix (array)? This would make it easier to determine best_match when length(mbank_sub) == 1 or length(sps_sub) == 1.
Implicitly, I would suggest introducing a drop = FALSE argument in the return statement of the function for clarity, so as to have the correct individual row and column indices for the respective matching partners (indices db_sub & sps_sub) ...

2.) As there is also an equivalent of the SQL version of MassBank for MONA, available at https://github.com/computational-metabolomics/msp2db/releases/tag/v0.0.14-mona-23042021
would it be possible to have that included in the tutorial? Alternatively, a tutorial section on how to compare against .msp databases (MsBackendMsp) would be great.

3.) Any hint on parallelisation would be helpful, especially regarding parallel database connections when several hundred features should be compared.

many thanks
kind regards
Tony

@jorainer
Owner

Thanks for the feedback!

  2. Sure, that's a very good suggestion. I myself have never used the MsBackendMsp backend, but I will look into that. Regarding the database, this is a non-official database dump, right? In what format does MoNA provide the data in general? When I last checked, it was their own YAML format... not ideal...

  3. For the parallel (and maybe more convenient) matching, have a look also at the new tutorial I added: https://jorainer.github.io/SpectraTutorials/articles/Spectra-matching-with-MetaboAnnotation.html . Pre-filtering by matching precursor m/z significantly increases the performance. Alternatively, you can use BPPARAM = MulticoreParam(4) or equivalent to run comparisons in parallel. However, since there are no parallel connections to the database, parallel processing is limited to the data after it has been retrieved from the database.
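
The idea behind pre-filtering by precursor m/z can be sketched in base R, without any Spectra or MetaboAnnotation calls. This is only an illustration: the m/z values, the variable names (`query_mz`, `db_mz`, `candidates`) and the 10 ppm tolerance are all assumptions, not taken from the tutorial.

```r
# Hypothetical precursor m/z values (made up for illustration).
query_mz <- c(180.0634, 256.2402)                    # query spectra
db_mz    <- c(180.0630, 180.0700, 256.2400, 300.1000) # database spectra

ppm <- 10  # assumed relative tolerance in parts per million

# For each query, keep only the database entries whose precursor m/z
# lies within the ppm window around the query precursor m/z.
candidates <- lapply(query_mz, function(mz) {
  which(abs(db_mz - mz) <= mz * ppm / 1e6)
})

# Number of spectra comparisons with and without the pre-filter.
n_all      <- length(query_mz) * length(db_mz)
n_filtered <- sum(lengths(candidates))
```

Only the candidate pairs would then need a full spectrum comparison, which is where the performance gain comes from.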

@jorainer
Owner

Ah, and regarding 1): I've opened an issue in Spectra.

@jorainer
Owner

Again, regarding 1): compareSpectra has a parameter SIMPLIFY. If you set that to SIMPLIFY = FALSE it will always return a matrix. We'll look into changing the default SIMPLIFY = TRUE into SIMPLIFY = FALSE, but for now it should work if you do that manually.
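
A minimal base-R sketch of why a guaranteed matrix return helps when one side has length 1. It does not call compareSpectra; the score values and the names `scores`, `simplified`, `kept` and `best_match` are made up for illustration.

```r
# Hypothetical similarity scores: 1 query spectrum (row) against
# 3 database spectra (columns). Values are made up for illustration.
scores <- matrix(c(0.21, 0.87, 0.45), nrow = 1,
                 dimnames = list("query1", c("db1", "db2", "db3")))

# Default subsetting simplifies the 1 x 3 matrix to a named vector,
# so the row dimension is lost.
simplified <- scores[1, ]

# With drop = FALSE the result stays a 1 x 3 matrix.
kept <- scores[1, , drop = FALSE]

# On a matrix, which(..., arr.ind = TRUE) reports both the row and the
# column index of the best match, even with a single row.
best_match <- which(kept == max(kept), arr.ind = TRUE)
```

Here `best_match` has one row with the columns `row` and `col`, so the same indexing code works regardless of how many spectra are on either side.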

@Tony-II
Author

Tony-II commented Oct 24, 2021

Many thanks so far! Regarding:
1.) This works really well ...
2.) If the format under https://github.com/computational-metabolomics/msp2db/releases/tag/v0.0.14-mona-23042021
is not the ideal one, other formats are available under:

https://mona.fiehnlab.ucdavis.edu/downloads
The formats offered there are:
.json
.msp (NIST compatible)
.sdf (NIST compatible)
