Duplicate observation IDs error even no duplication was found #976

Open
mortonjt opened this issue Sep 1, 2024 · 3 comments

@mortonjt
Contributor

mortonjt commented Sep 1, 2024

This issue was first reported here: scikit-bio/scikit-bio#2121

@quliping let's continue the discussion here for future reference

@wasade
Member

wasade commented Sep 1, 2024

Odd. Could be an evaluation order or in the errcheck. What happens if the table is loaded directly into biom without using pandas?

@quliping

quliping commented Sep 4, 2024

> Odd. Could be an evaluation order or in the errcheck. What happens if the table is loaded directly into biom without using pandas?

Sorry for the late reply. I tried Table.from_tsv(), and it worked. I'm not sure whether I should be using this method, but the table was loaded into biom successfully.

And I think I found the reason. In my original script, asv_table = Table(df.T.values, observation_ids, sample_ids) worked, but asv_table = Table(df.values, observation_ids, sample_ids) raised the 'Duplicate observation IDs' error. It seems I mixed up the rows and columns. Each column should be a sample and each row should be a species, right? I had it wrong before (my table was inverted). I think the error message is confusing and should perhaps be changed to something like 'The number of columns is different from the number of given sample ids, please check if there is a mistake in the placement of samples and observations in given table.', because there are indeed no duplicate IDs.
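To make the mix-up concrete, here is a minimal sketch of the kind of shape check that would have caught it before the Table constructor was called. The helper name check_table_shape is hypothetical and is not part of biom's API; it only assumes what is stated in this thread, namely that rows are observations and columns are samples.

```python
def check_table_shape(shape, observation_ids, sample_ids):
    """Hypothetical pre-check (not part of biom): rows must be observations,
    columns must be samples."""
    n_rows, n_cols = shape
    n_obs, n_samp = len(observation_ids), len(sample_ids)

    # Expected orientation: (observations, samples).
    if (n_rows, n_cols) == (n_obs, n_samp):
        return

    # Same counts but swapped: the matrix is most likely transposed.
    if (n_rows, n_cols) == (n_samp, n_obs):
        raise ValueError(
            f"Data has shape {shape}, but {n_obs} observation IDs and "
            f"{n_samp} sample IDs were given; the table looks transposed."
        )

    # Anything else is a plain mismatch between the data and the IDs.
    raise ValueError(
        f"Data has shape {shape}, which matches neither "
        f"({n_obs}, {n_samp}) nor its transpose."
    )


# Example: 2 observations (ASVs) and 3 samples.
observation_ids = ["ASV1", "ASV2"]
sample_ids = ["S1", "S2", "S3"]

# Observations as rows: passes silently.
check_table_shape((2, 3), observation_ids, sample_ids)

# Samples as rows (the inverted layout): reported as a shape problem.
try:
    check_table_shape((3, 2), observation_ids, sample_ids)
except ValueError as err:
    print(err)
```

With a pandas DataFrame, the shape argument would be df.values.shape (or df.T.values.shape after transposing). Note that a square table, with as many observations as samples, cannot be told apart from its transpose by shape alone.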

@wasade
Member

wasade commented Sep 4, 2024

A traditional feature table, dating from the QIIME 1 days, has rows as observations, and that is what BIOM models. You're right, though, that the error message is wrong: this should trigger a shape or related error, and that would be valuable to fix. Thank you for reporting!
