Old files archive are not correct #58

SantaMcCloud · 2024-09-07T15:40:57Z

Hello,

sorry for raising this here; I couldn't find an email address for any of the CAMI staff. I'm currently working on my bachelor thesis, which includes building a workflow on the web server https://usegalaxy.eu/, which hosts many different bioinformatics tools. Since AMBER is available there now, I need some benchmarks to test the workflow, and I discovered that you provide the old archives such as CAMI low, mouse gut toy, etc. I have worked with the CAMI low and the mouse gut toy low archives, but I also want to test the high or medium archive, and that is where the problem is. I downloaded both tarballs [from http://gigadb.org/dataset/100344] and extracted them, but they contain only the samples and no other files, although the gsa and binning files should also be there. Is it possible to fix this, or is there another source that provides the correct tarballs for download?

This would be a great help. Thank you in advance, and again, I'm sorry if this is the wrong place for this topic!
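For anyone wanting to confirm the same symptom before extracting a large download, a tarball's member list can be inspected directly. The following is a minimal sketch, not an official CAMI tool: the function name `missing_components` and the name fragments `"gsa"` and `"binning"` are assumptions based on the file names mentioned in this thread, and the actual archive layout may differ.

```python
import tarfile


def missing_components(tar_path, expected=("gsa", "binning")):
    """Return the expected name fragments that no archive member contains.

    `expected` holds hypothetical fragments taken from this thread;
    adjust them to the real file names of the dataset.
    """
    # "r:*" lets tarfile detect the compression (gz, bz2, xz, or none).
    with tarfile.open(tar_path, "r:*") as tar:
        names = [member.name.lower() for member in tar.getmembers()]
    return [frag for frag in expected if not any(frag in n for n in names)]
```

An empty list means every expected fragment was found in at least one member name; a non-empty list names what appears to be missing from the archive.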

fernandomeyer · 2024-09-13T05:52:48Z

You can download the binning gold standards for the Medium and High pooled assemblies here:
Medium: https://openstack.cebitec.uni-bielefeld.de:8080/swift/v1/CAMI_I_MEDIUM/pooled_gsa_mapping.binning.tsv
High: https://openstack.cebitec.uni-bielefeld.de:8080/swift/v1/CAMI_I_HIGH/gsa_mapping_pool.binning
Other files are available, as described for each dataset at https://data.cami-challenge.org/participate. The camiClient.jar can sometimes be useful for listing and downloading the available files.

SantaMcCloud · 2024-10-05T00:54:22Z

Yes, this helped, thank you, but there is a problem with the high dataset. The reads and the binning files of the sample don't have matching IDs. I don't know whether this happens only because the reads were downloaded from gigadb and not from the openstack. I tried to download them from there, but I don't have access, at least for the first sample; I did not try the others.
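To make a mismatch like this concrete, the two ID sets can be compared directly. This is only a sketch under stated assumptions: the binning file is taken to be tab-separated with the sequence ID in the first column and header lines starting with `@` or `#`, and the reads are taken to be four-line FASTQ records, optionally gzip-compressed. The function names are hypothetical and not part of AMBER or camiClient.

```python
import gzip


def binning_ids(path):
    """Collect first-column IDs from a tab-separated binning file.

    Assumption: header and comment lines start with "@" or "#".
    """
    ids = set()
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith(("@", "#")):
                continue
            ids.add(line.split("\t")[0])
    return ids


def fastq_ids(path, limit=None):
    """Collect read IDs from a FASTQ file (assumes four lines per record)."""
    opener = gzip.open if path.endswith(".gz") else open
    ids = set()
    with opener(path, "rt") as fh:
        for i, line in enumerate(fh):
            if i % 4 == 0:  # header line of each record
                # Drop the leading "@" and any description after whitespace.
                ids.add(line[1:].split()[0])
                if limit is not None and len(ids) >= limit:
                    break
    return ids


def compare_ids(reads_path, binning_path, limit=None):
    """Return (shared, only_in_reads, only_in_binning) as sets."""
    reads = fastq_ids(reads_path, limit)
    bins = binning_ids(binning_path)
    return reads & bins, reads - bins, bins - reads
```

Printing a few entries from `only_in_reads` and `only_in_binning` side by side usually shows quickly whether the IDs differ only in a prefix or suffix or come from different data altogether. The `limit` argument keeps the check fast on large read files.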

Then I tried to switch to the CAMI2 Toy set, which has results in these repositories, but the archive in the dataset directory is missing some 'tar.gz' files. For example, sample_3 only has the contigs in there, but the reads are missing. Since there is a 'README.txt' file for every file, I assume the missing files should be in there? Would it be possible to make the missing files accessible?

Sorry for these questions, but if there is no way to update this archive or to fix the mismatched sequence IDs in the CAMI high dataset, just let me know!

Thank you in advance!

SantaMcCloud changed the title ~~Old files archive not are not correct~~ Old files archive are not correct Sep 8, 2024
