qiime feature-table rarefy --p-with-replacement: sum(pvals[:-1]) > 1.0 #245

Open
nick-youngblut opened this issue Mar 9, 2021 · 7 comments · Fixed by biocore/biom-format#961

Comments

@nick-youngblut

Bug Description
Running qiime feature-table rarefy --p-with-replacement sometimes fails with the error: sum(pvals[:-1]) > 1.0

This is likely a float rounding issue.

Steps to reproduce the behavior

  • qiime feature-table rarefy --p-with-replacement --p-sampling-depth 500000

The counts per sample for my feature table are:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
2381744 2763866 2855986 2929877 2987576 4566374 

...so the problem is not limited to very small sample sizes (e.g., n = 1 or n = 10).

Computation Environment

  • OS: Ubuntu 18.04.5
  • QIIME 2 Release: qiime2 2020.8.0
@thermokarst thermokarst transferred this issue from qiime2/qiime2 Mar 9, 2021
@thermokarst
Contributor

Thanks for reporting, @nick-youngblut. I think this error might be originating in the biom package. Would you mind running this bit of Python code (using the offending data) to see if you can recreate it in pure biom?

import qiime2
import biom

artifact = qiime2.Artifact.load('table.qza')
table = artifact.view(biom.Table)

table.subsample(500000, axis='sample', by_id=False, with_replacement=True)

@nick-youngblut
Author

I can't seem to reproduce the error, so it appears to occur rarely.

@thermokarst
Contributor

Thanks @nick-youngblut. I don't think this issue can be resolved in this QIIME 2 plugin - the rarefy method just wraps biom, so I'll keep this open for now, in case you find a more reliable test case. Thanks!

@nick-youngblut
Author

nick-youngblut commented Mar 18, 2021

I ran into the issue again, and I was able to confirm that the issue is caused by biom:

Python 3.6.10 | packaged by conda-forge | (default, Apr 24 2020, 16:42:08)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import qiime2
>>> import biom
>>> artifact = qiime2.Artifact.load('otu.qza')
>>> table = artifact.view(biom.Table)
>>> table.subsample(500000, axis='sample', by_id=False, with_replacement=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/ebio/abt3_projects/test_project/bin/llmgp/.snakemake/conda/5b653c1a/lib/python3.6/site-packages/biom/table.py", line 2824, in subsample
    _subsample(data, n, with_replacement)
  File "biom/_subsample.pyx", line 53, in biom._subsample._subsample
  File "mtrand.pyx", line 4214, in numpy.random.mtrand.RandomState.multinomial
ValueError: sum(pvals[:-1]) > 1.0

I guess that I should post the issue on https://github.com/biocore/biom-format
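
The traceback above ends inside numpy.random.mtrand.RandomState.multinomial, so the same ValueError can be triggered without biom or QIIME 2. The following is a minimal sketch, not biom's actual code path: the helper name try_multinomial and the probability vectors are made up for illustration. It only shows that the error is raised when the probabilities passed to multinomial, excluding the last one, add up to more than 1.

```python
import numpy as np


def try_multinomial(n, pvals, seed=0):
    """Draw n items with replacement according to pvals.

    Hypothetical helper for illustration only.
    Returns (counts, None) on success or (None, error_message) on failure.
    """
    rng = np.random.RandomState(seed)
    try:
        return rng.multinomial(n, pvals), None
    except ValueError as err:
        return None, str(err)


# Well-formed probabilities: they sum to 1, so the draw succeeds.
ok_counts, ok_err = try_multinomial(500, [0.2, 0.3, 0.5])

# Malformed probabilities: the first two already sum to 1.1,
# which is the condition named in the error message.
bad_counts, bad_err = try_multinomial(500, [0.6, 0.5, 0.1])

print(ok_counts.sum(), ok_err)
print(bad_err)
```

If the second call reports the same message as the traceback, the problem is in how the per-sample probabilities are derived from the table values rather than in the sampling depth itself.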

@mortonjt

mortonjt commented Apr 8, 2024

See the link above -- @nick-youngblut, is it possible that you had fractional values in the biom table? Rounding to ints seems to have resolved this issue.
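
A quick way to test the fractional-values hypothesis is to look for non-integer entries before subsampling. This is a rough sketch in plain Python: the function names and the sample values are hypothetical, and for a real biom.Table the nonzero values would have to be pulled out first (for example from table.matrix_data.data, assuming the usual sparse-matrix backing).

```python
def fractional_values(values):
    """Return the entries that are not whole numbers (hypothetical helper)."""
    return [v for v in values if not float(v).is_integer()]


def round_counts(values):
    """Round every entry to the nearest integer count (hypothetical helper)."""
    return [int(round(v)) for v in values]


# Made-up counts for one sample; two of them are fractional.
sample = [12.0, 1.6, 3.25, 7.0]

print(fractional_values(sample))  # [1.6, 3.25]
print(round_counts(sample))       # [12, 2, 3, 7]
```

Rounding changes the data, so whether it is an acceptable workaround depends on how the fractional values got into the table in the first place.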

@wasade
Member

wasade commented Apr 10, 2024

Thank you, @mortonjt, for opening the issue on the biom-format tracker. I was unaware of this edge case; we'll look at getting it addressed in the next release.

@wasade
Member

wasade commented May 7, 2024

This issue was addressed in biocore/biom-format#961 and it may make sense to close this issue.

As a general comment, please do consider opening issues when appropriate with affected projects so problems can be resolved in a timely manner.


4 participants