Add a CSV/XLSX file reader to core #656

davidorme · 2025-01-07T10:42:26Z

Both the plants and animals models require users to provide cohort data. For plants, this means providing tuples of data:

(cell id, plant functional type, number of individuals, individual size)

There can be multiple entries per cell id and different numbers of cohorts per cell. The easiest and sanest format for this data is a simple data frame of those tuples, and the natural format for creating and maintaining that data is a CSV or XLSX file. Forcing users to convert this into NetCDF for input is not sensible.
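To make the shape of that data concrete, here is a small sketch of such a cohort table read with pandas. The column names (cell_id, pft, n_individuals, individual_size) and all of the values are made up for illustration; the issue does not fix a naming scheme.

```python
import io

import pandas as pd

# Hypothetical column names and values, for illustration only.
csv_text = """cell_id,pft,n_individuals,individual_size
0,broadleaf,100,0.15
0,shrub,250,0.05
1,broadleaf,80,0.20
2,broadleaf,60,0.25
2,shrub,300,0.04
2,shrub,120,0.10
"""

# Each row is one cohort: (cell id, functional type, count, size).
cohorts = pd.read_csv(io.StringIO(csv_text))

# Cells can hold different numbers of cohorts.
cohorts_per_cell = cohorts.groupby("cell_id").size()
print(cohorts_per_cell)
```

Nothing here forces one row per cell, which is exactly why a flat data frame is a better fit than a gridded NetCDF variable.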

So, we need to:

Add a CSV/XLSX loader.
This should use pandas as that is already a requirement of xarray and is designed explicitly to handle data frames, rather than using the standard library csv or any of the numpy structures.
I think we will need to explicitly add openpyxl to [tool.poetry.dependencies] to support reading XLSX format, since that is the package pandas uses to read XLSX files.
Test that it works!

It should go in virtual_ecosystem.core.readers and I think the signature will look like:

@register_file_format_loader(file_types=(".csv", ".xlsx"))
def load_from_dataframe(file: Path, var_name: str) -> DataArray:
    """Loads a DataArray from a data frame format."""

The format registry should then automatically switch to using this loader for CSV and XLSX files.
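As a rough sketch of what the body of that loader might look like, assuming each requested variable is a single column in the file: the register_file_format_loader decorator is left off so the snippet runs on its own, and the "index" dimension name is a placeholder rather than anything the project has settled on.

```python
from pathlib import Path

import pandas as pd
from xarray import DataArray


def load_from_dataframe(file: Path, var_name: str) -> DataArray:
    """Load one column of a CSV or XLSX file as a one-dimensional DataArray."""
    suffix = file.suffix.lower()
    if suffix == ".csv":
        df = pd.read_csv(file)
    elif suffix == ".xlsx":
        # pandas needs the openpyxl package installed to read XLSX files.
        df = pd.read_excel(file, engine="openpyxl")
    else:
        raise ValueError(f"Unsupported file type: {suffix}")

    if var_name not in df.columns:
        raise KeyError(f"Variable {var_name!r} not found in {file}")

    # Assumption: each row is one record, so the column maps onto a single
    # dimension, given the placeholder name "index" here.
    return DataArray(df[var_name].to_numpy(), dims=["index"], name=var_name)
```

Dispatching on the file suffix inside one function keeps a single registered loader for both formats, which matches the file_types tuple in the proposed signature.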

There is some ugliness here in that the file is going to be opened multiple times to load each variable as we don't have persistent file handles, but the same is currently true for NetCDF. A better way to do this in future would be to open each file within the data configuration once to access a tuple of variables that are claimed to live in that file, rather than independently opening the file specified for each variable.
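The "open each file once" idea could be sketched roughly as below. This is not how the data configuration currently works: load_grouped_by_file and the read_file callable are hypothetical names, and read_file stands in for whatever reads a whole file into a mapping of variable name to data.

```python
from collections import defaultdict
from pathlib import Path
from typing import Any, Callable


def load_grouped_by_file(
    variables: list[tuple[str, Path]],
    read_file: Callable[[Path], dict[str, Any]],
) -> dict[str, Any]:
    """Read each file once and pull out every variable claimed to be in it."""
    # Collect the variable names requested from each file.
    names_by_file: dict[Path, list[str]] = defaultdict(list)
    for var_name, file in variables:
        names_by_file[file].append(var_name)

    loaded: dict[str, Any] = {}
    for file, var_names in names_by_file.items():
        contents = read_file(file)  # one open per file, not one per variable
        for var_name in var_names:
            if var_name not in contents:
                raise KeyError(f"Variable {var_name!r} not found in {file}")
            loaded[var_name] = contents[var_name]
    return loaded
```

With three variables spread over two files, read_file is called twice rather than three times, and a variable that is claimed but missing from its file is reported against that file.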

davidorme assigned davidorme and sallymatson and unassigned davidorme Jan 7, 2025

davidorme mentioned this issue Jan 7, 2025

Add a data method to check variables form a data frame #657

Open

sallymatson linked a pull request Jan 9, 2025 that will close this issue

Csvxlsx file reader #664

Open

8 tasks
