split_occurrence_data uses the fields encountered in the first file for all records #361

zzeppozz · 2022-09-09T14:49:02Z

Describe the bug

split_occurrence_data takes the output fields for all records from the first file it reads. It also adds species_name, x, and y fields to all records. This may not be a problem, but it is unexpected.

To Reproduce

Run split_occurrence_data three times: with an iDigBio DWCA alone, with a GBIF DWCA alone, and with both. The output from the iDigBio-only run and from the combined run contains only iDigBio fields. In the combined run, the configuration file lists the iDigBio data first, so that DWCA is the first one the code encounters.

Expected behavior

Records should contain the union of all fields from all input files; a field that occurs in more than one input file appears only once. A value is filled in when it is present in the input record.
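A minimal sketch of the expected behavior, not the actual split_occurrence_data code. It assumes records are available as plain dicts in re-iterable lists, and the helper names (`union_fieldnames`, `write_records`) and sample records are hypothetical. Field names are collected across all inputs in first-seen order, and `csv.DictWriter` fills missing values with an empty string via `restval`.

```python
import csv
import io


def union_fieldnames(record_sources):
    """Return every field name seen across all sources, in first-seen order."""
    seen = {}  # dict keys preserve insertion order and drop duplicates
    for records in record_sources:
        for record in records:
            for name in record:
                seen.setdefault(name, None)
    return list(seen)


def write_records(out_file, record_sources):
    """Write all records with one header that covers every input field."""
    fieldnames = union_fieldnames(record_sources)
    # restval fills fields that a given record does not have
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for records in record_sources:
        writer.writerows(records)
    return fieldnames


# Illustrative records only; real DWCA records carry many more fields.
idigbio = [{"species_name": "Quercus alba", "x": "-95.2", "y": "38.9", "idigbio:uuid": "a1"}]
gbif = [{"species_name": "Quercus alba", "x": "-94.1", "y": "39.0", "gbifID": "42"}]

buffer = io.StringIO()
fields = write_records(buffer, [idigbio, gbif])
print(buffer.getvalue())
```

This sketch reads every input once to build the header before writing, so it trades a second pass over the data for a stable, complete set of columns.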
