How can I define animal diets using Biological Interactions? #25

Open
karilint opened this issue May 4, 2022 · 13 comments

@karilint

karilint commented May 4, 2022

Hi,
Dietary information is used in many ecological (e.g. food webs), macroecological (community structure by dietary guild) and paleontological (teeth vs. diets) studies.

Many scientific papers list animals and their diets. However, these studies use different methodologies and report their results in many different ways. The general aim of these studies is to find out what animals eat and in what proportions. Currently, there is no standard way of reporting or sharing dietary data, and I was wondering if the Biological Interactions Data interest group could help with this matter.

Quite often the diet composition data contains information like:

food items, life stages and parts consumed (verbatim and scientific names)

  • food item consumed (e.g. termites)
  • food item + life stage consumed (e.g. mayfly larvae)
  • food item + part consumed (e.g. Leaves of Wild Bitter Yam (Dioscorea dumetorum))

proportion, share or importance:

  • as percentages by volume or frequency (dwc:measurementValue)
  • or as a list order from the most consumed to the least important food item (something like ggbn:sequence)

with a measurement method

  • observed feeding
  • faeces, digestive tract or stomach content
  • time spent feeding

the study time (similar to dwc:verbatimEventDate) and sampling effort (dwc:samplingEffort)

  • 'between November 1989 and October 1990'
  • 'during spring', '3 months'

location of the analysed diet

  • 'Poco das Antas Biological Reserve, Brazil' (dwc:verbatimLocality)

with the name of the data source and a possible cited reference in the data source

  • dwc:references, dwc:associatedReferences

These are only a few terms that relate to animal diets but I suppose that many dwc terms already could be used.
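To make the list above concrete, here is a minimal sketch of how one diet item could be held as a single record. The class name, field names and helper are hypothetical and not part of any standard; the comments only note which term from the list each field is loosely modelled on. The example values are stitched together from the separate examples above (plus a placeholder consumer name) and do not describe a real observation.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DietItemRecord:
    """One food item reported for one consumer in one study (illustrative only)."""
    consumer_scientific_name: str
    verbatim_food_item: str                       # e.g. "mayfly larvae"
    food_item_scientific_name: Optional[str] = None
    life_stage_consumed: Optional[str] = None     # e.g. "larvae"
    part_consumed: Optional[str] = None           # e.g. "leaves"
    measurement_value: Optional[float] = None     # cf. dwc:measurementValue
    measurement_unit: Optional[str] = None        # e.g. "% volume"
    sequence: Optional[int] = None                # rank order, cf. ggbn:sequence
    measurement_method: Optional[str] = None      # e.g. "stomach content"
    verbatim_event_date: Optional[str] = None     # cf. dwc:verbatimEventDate
    sampling_effort: Optional[str] = None         # cf. dwc:samplingEffort
    verbatim_locality: Optional[str] = None       # cf. dwc:verbatimLocality
    references: Optional[str] = None              # cf. dwc:references


def to_row(record: DietItemRecord) -> dict:
    """Flatten a record to a dict, dropping fields that were not reported."""
    return {key: value for key, value in asdict(record).items() if value is not None}


# Placeholder record: not a real observation.
example = DietItemRecord(
    consumer_scientific_name="Placeholder consumer",
    verbatim_food_item="termites",
    sequence=1,
    measurement_method="stomach content",
    verbatim_event_date="between November 1989 and October 1990",
)
row = to_row(example)
```

Leaving unreported fields out of the row, rather than filling them with defaults, keeps the difference between "not reported" and "reported as zero" visible.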

Would animal diets be one sample case for Biological Interactions?

@jhpoelen
Contributor

jhpoelen commented May 4, 2022

Hi @karilint - yes, I'd say that animal diets are an example of Biological Interactions. As you might know, many of your colleagues have captured diets in digital form, including the important details that you mentioned (e.g., lifestage, frequency, date ranges, body part consumed). You might get some inspiration by looking at the interaction datasets indexed by https://globalbioticinteractions.org/sources. In particular, you might want to review the rigorous approach that @ahhurlbert et al. are using to maintain and extend their Avian Diet Database https://github.com/hurlbertlab/dietdatabase .

Curious to hear what you come up with.

@karilint
Author

karilint commented May 4, 2022

Hi @jhpoelen ,
thank you for those interesting links and examples. I myself have a large dataset of mammalian diets (similar to the Avian Diet Database). The globalbioticinteractions site seems very interesting and may provide a platform for the data I have. Ideally, though, there would be a common set of terms for describing animal diets, including the parts/life stages eaten and the proportions.

@jhpoelen
Contributor

jhpoelen commented May 4, 2022

@karilint many folks use:

Uberon for lifestages / body parts.

Relations Ontology for Biotic Interaction terms

For the proportions, I've seen various measures (% stomach volume, stomach volume, relative frequency of occurrence, etc.). I'd suggest documenting what you have today and then, time permitting, standardizing terms over time. This is to avoid analysis paralysis: https://en.wikipedia.org/wiki/Analysis_paralysis .

@jhpoelen
Contributor

jhpoelen commented May 4, 2022

By capturing all the details you need, and translating (where possible) to other formats (like DwC), you retain the original details, while also offering a more "standardized" perspective.
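A minimal sketch of that idea, assuming the verbatim measure label lives in a column called `verbatim_measure`: the original label is kept untouched and a standardized label is added next to it. The column names, the lookup table and the normalized labels are all hypothetical and not taken from DwC or any other vocabulary.

```python
# Hypothetical lookup from verbatim measure labels to a normalized label.
# The keys reuse the measures mentioned above; the values are made-up labels.
MEASURE_LOOKUP = {
    "% stomach volume": "percent_stomach_volume",
    "stomach volume": "stomach_volume",
    "relative frequency of occurrence": "relative_frequency_of_occurrence",
}


def standardize_measure(row: dict) -> dict:
    """Return a copy of the row with a standardized measure added.

    The verbatim value is never overwritten, so the original detail survives.
    """
    verbatim = row.get("verbatim_measure", "")
    # Normalize case and whitespace before looking the label up.
    key = " ".join(verbatim.lower().split())
    out = dict(row)
    out["standard_measure"] = MEASURE_LOOKUP.get(key)  # None if not mapped yet
    return out


mapped = standardize_measure({"verbatim_measure": "  % Stomach  Volume "})
unmapped = standardize_measure({"verbatim_measure": "time spent feeding"})
```

Labels that are not in the lookup yet simply come back as `None`, so the mapping can grow over time without blocking data capture.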

@karilint
Author

karilint commented May 6, 2022

@jhpoelen I've been talking with the people who created the Ecological Traitdata Standard (ETS) https://github.com/EcologicalTraitData/ETS. It has most of the things I need (only a couple of terms are missing). I have already 'documented' what I have but skipped the standardisation part because of a lack of knowledge.

I got good suggestions from ETS people for finding/creating a vocabulary for animal diets (for example the Biological Interactions). My dataset is a compilation of diets for 4453 mammalian species, having 26849 rows of 'dietary items'. I'm more than happy to share a sample of the data if someone could help me with mapping the current terms and standardising the new ones. My aim is not just to publish the data set but also to enable sharing and updating it using standard terms.

@jhpoelen
Contributor

jhpoelen commented May 6, 2022

@karilint I like your idea to share a sample so others can chime in on what existing terms or datasets might be useful for you to look at.

If you'd like, I can help index the example with GloBI so that we can see how your rich dataset fits into the GloBI indexes.

@karilint
Author

karilint commented May 8, 2022

Hi @jhpoelen ,
I created a 2000-row sample of the data set. It can be viewed as a Google Sheet at https://docs.google.com/spreadsheets/d/1rZGkI-lyKkKWNH3eMOOm-HYot2WGiZOT-qY7E811l2o/edit?usp=sharing

The basic idea is that many of the ETS vocabulary terms fit my purpose very well (although the entities Traitdata and MeasurementOrFact are probably wrong). The other terms I have at the end of the file are vaguer: DWC_samplingEffort, DWC_verbatimEventDate, DWC_associatedReferences, GGBN_sequence, DWC_AssociatedTaxa, ABCDEFG_PartOfOrganism. For one term, NO_NAME_verbatimAssociatedTaxa, I have not found any comparable term.

So, based on the sample data, do you think there is a possibility to use the Biological Interactions and describe the data with more appropriate terms? I'm a bit out of my comfort zone here.

@jhpoelen
Contributor

@karilint apologies for the delay! I am still planning to look at this sooner rather than later. Please do poke me if you don't hear from me by the end of this week.

@karilint
Author

@jhpoelen excellent! I'm very grateful for your help.

@karilint
Author

Hi @jhpoelen ,
I've also been quite busy lately. For your information, our submitted manuscript on mammalian diets will be published within three months or so. Before I submit the last version of the manuscript, it would be great to have the terms in place so that I can use the correct ones for future data imports/exports.
All the best!

@karilint
Author

karilint commented Aug 2, 2022

Hi @jhpoelen ,
any new ideas on the matter?

@jhpoelen
Contributor

jhpoelen commented Aug 15, 2022

Apologies for the delay, and thank you for reminding me.

I had a look at your data sheet, at https://docs.google.com/spreadsheets/d/1rZGkI-lyKkKWNH3eMOOm-HYot2WGiZOT-qY7E811l2o/edit#gid=764931096 .

I noticed you use a wide table format: one row contains all the information you need. I like wide table formats because I don't have to switch between tables to do my analysis.
Also, wide tables are easier to index in GloBI.

But . . . standards like ETS and their cousin DwC-A are designed to put things in separate tables. Typically, you'd have separate files for occurrences, measurementOrFact, taxa, etc. Then these files would be referenced in meta.xml . This file describes the meaning of these tables and how they relate.
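As a rough illustration of what "separate tables" means, the sketch below splits wide rows into an occurrence-like table and a measurementOrFact-like table linked by an identifier. The output keys borrow Darwin Core term names, but the wide-table column names (`consumer`, `food_item`, `measure`, `value`, `method`, `locality`), the generated `occ-N` identifiers and the sample values are assumptions made up for this example. It does not write the files or the meta.xml descriptor.

```python
def split_wide_rows(wide_rows):
    """Split wide diet rows into two linked lists of dicts.

    Returns (occurrences, measurements); each measurement points back to
    its occurrence through a shared occurrenceID.
    """
    occurrences, measurements = [], []
    for index, row in enumerate(wide_rows, start=1):
        occurrence_id = f"occ-{index}"  # hypothetical identifier scheme
        occurrences.append({
            "occurrenceID": occurrence_id,
            "scientificName": row["consumer"],
            "associatedTaxa": row.get("food_item", ""),
            "verbatimLocality": row.get("locality", ""),
        })
        # Only emit a measurement row when a value was actually reported.
        if row.get("value") not in (None, ""):
            measurements.append({
                "occurrenceID": occurrence_id,
                "measurementType": row.get("measure", ""),
                "measurementValue": row["value"],
                "measurementMethod": row.get("method", ""),
            })
    return occurrences, measurements


# Placeholder rows: not real observations.
wide = [
    {"consumer": "Placeholder consumer", "food_item": "termites",
     "measure": "% stomach volume", "value": "10", "method": "stomach content"},
    {"consumer": "Placeholder consumer", "food_item": "mayfly larvae"},
]
occurrences, measurements = split_wide_rows(wide)
```

Each list could then be written to its own CSV file; how those files are declared and related in meta.xml is left out here.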

In my experience, ETS/DwC-A are pretty nice for exchanging data with your colleagues, but for editing and analysis the wide table format is a little easier to manage, especially when working with spreadsheet programs.

To get the best of both worlds, I'd suggest using the wide format for your own management and analysis, and including a description of the fields with some examples. Then, if you have time, you can transform the wide table format into an ETS package and include it in your publication.

With this approach, you can use terms that ETS / DwC-A do not have (yet) like verbatimAssociatedTaxa, and split out the verb of the interaction from their object (e.g., separate columns for the interaction type like "eats" from the object like "Termites").
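A small sketch of that split, assuming some cells hold a combined phrase such as "eats Termites" while others hold only the food item. The function name, the list of known verbs and the default verb are hypothetical choices for this example.

```python
# Hypothetical list of interaction verbs that may prefix a verbatim cell.
KNOWN_VERBS = ("eats",)


def split_interaction(phrase: str, default_verb: str = "eats"):
    """Split a verbatim phrase into (interaction type, object).

    If the phrase does not start with a known verb, the whole phrase is
    treated as the object and the default verb is assumed.
    """
    text = phrase.strip()
    for verb in KNOWN_VERBS:
        prefix = verb + " "
        if text.lower().startswith(prefix):
            return verb, text[len(prefix):].strip()
    return default_verb, text


with_verb = split_interaction("eats Termites")
without_verb = split_interaction("  Termites ")
```

Keeping the verb in its own column makes it possible to add other interaction types later without re-parsing the object column.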

In other words, I think you are trying to solve two different problems (manage/describe/capture data, exchange data) with one solution. This is pretty challenging, because managing data and exchanging data are very different beasts in my mind. To avoid this, I typically try to first solve one problem: capture the data in a form that works for me. Then, time permitting, I'd focus on the exchange part. And, if you don't have time, you can always add it later because you've captured the original data.

Hope this helps. If not, please holler and I'd be happy to go over it in a video chat if you'd like.

@karilint
Author

Hi @jhpoelen ,
thank you for this information. I'll be on fieldwork for the next four weeks or so. I'll look at this in more detail afterwards.
