This repository has been archived by the owner on Sep 4, 2024. It is now read-only.

split_occurrence_data uses the fields encountered in the first file for all records #361

Open
zzeppozz opened this issue Sep 9, 2022 · 0 comments

zzeppozz (Contributor) commented Sep 9, 2022

Describe the bug

split_occurrence_data takes the output fields for all records from the fields found in the first input file. It also adds species_name, x, and y fields to every record. This may not be a problem, but it is unexpected.

To Reproduce

Run split_occurrence_data three times: once with an iDigBio DWCA, once with a GBIF DWCA, and once with both. The output from the iDigBio run and from the combined run contains only iDigBio fields. In the combined run, the configuration file lists the iDigBio data first, so that DWCA is the first file the code encounters.

Expected behavior

Records should contain the union of all fields from all input files; a field that occurs in more than one input file appears only once. A value is filled in wherever the field is present in the input record.
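
As a rough illustration of the expected behavior (not the actual split_occurrence_data code), the sketch below collects the union of field names across all inputs before writing anything, then writes every record against that header. The function names, the sample records, and the field names idigbio_only and gbif_only are hypothetical, and writing an empty string for a field that a record lacks is an assumption; the issue does not say what the fill value should be.

```python
import csv
import io


def union_fieldnames(record_groups):
    """Return every field name seen in any record, in first-seen order, no duplicates."""
    seen = {}  # dict preserves insertion order, so it doubles as an ordered set
    for records in record_groups:
        for record in records:
            for name in record:
                seen.setdefault(name, None)
    return list(seen)


def write_records(record_groups, out_file, missing=""):
    """Write all records using the union of fields; absent fields get `missing`."""
    fieldnames = union_fieldnames(record_groups)
    # restval supplies the value for any field a given record does not have
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, restval=missing)
    writer.writeheader()
    for records in record_groups:
        writer.writerows(records)
    return fieldnames


# Hypothetical records standing in for rows read from two different sources
idigbio_records = [
    {"species_name": "Quercus alba", "x": "-95.2", "y": "38.9", "idigbio_only": "a"},
]
gbif_records = [
    {"species_name": "Quercus alba", "x": "-94.1", "y": "39.0", "gbif_only": "b"},
]

buffer = io.StringIO()
fields = write_records([idigbio_records, gbif_records], buffer)
print(buffer.getvalue())
```

The header has to be known before the first record is written, so this approach needs either a first pass over all inputs to gather field names or enough memory to hold the records, as it does here.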
