This Parquet file contains a subset of the original BIN file:
https://github.com/hypertidy/L3bin/blob/master/inst/extdata/AQUA_MODIS.20241125.L3b.DAY.CHL.NRT.parquet
The BIN file from here: https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/AQUA_MODIS.20241125.L3b.DAY.CHL.NRT.nc
R code to convert (using rhdf5) just the bin index and the chlor_a values:
    library(croc)  ## gh: sosoc/croc
    file <- "AQUA_MODIS.20241125.L3b.DAY.CHL.NRT.nc"
    d <- cbind(read_binlist(file), read_compound(file, "chlor_a"))
    #  bin_num nobs nscenes  weights    time_rec chlor_a_sum chlor_a_sum_squared
    #1  276025    3       1 1.732051  3019994880   0.1000465         0.005855022
    #2  276026    7       2 3.449490  7046648832   0.2427317         0.017250253
    #3  276027   10       2 4.000000 10066643968   0.3045491         0.023477813
    # ...
    arrow::write_parquet(d, gsub("nc$", "parquet", file), compression = "zstd")
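For anyone picking the Parquet up from Python, here is a minimal sketch of how the columns are usually turned into a per-bin value. It assumes the common L3 binned convention that the bin mean is `chlor_a_sum / weights`; that convention is not stated in this issue, so treat it as an assumption. The three rows are copied from the printed output above rather than read from the file, and the helper name `bin_mean` is hypothetical.

```python
# Three sample rows copied from the R output in this issue.
rows = [
    {"bin_num": 276025, "nobs": 3, "nscenes": 1, "weights": 1.732051,
     "chlor_a_sum": 0.1000465, "chlor_a_sum_squared": 0.005855022},
    {"bin_num": 276026, "nobs": 7, "nscenes": 2, "weights": 3.449490,
     "chlor_a_sum": 0.2427317, "chlor_a_sum_squared": 0.017250253},
    {"bin_num": 276027, "nobs": 10, "nscenes": 2, "weights": 4.000000,
     "chlor_a_sum": 0.3045491, "chlor_a_sum_squared": 0.023477813},
]


def bin_mean(row, name="chlor_a"):
    """Weighted mean for one bin, ASSUMING mean = sum / weights."""
    return row[f"{name}_sum"] / row["weights"]


# Map each bin number to its (assumed) mean chlorophyll-a value.
means = {row["bin_num"]: bin_mean(row) for row in rows}

for bin_num, value in means.items():
    print(bin_num, round(value, 6))
```

With the full file, the same division could be applied column-wise after reading the Parquet with a dataframe library.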
The L3 grid configuration is NROW = 4320.
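To make `NROW = 4320` concrete, the sketch below builds the per-row bin counts and maps a bin number to a bin centre. It assumes the standard integerized sinusoidal layout used for L3 binned products (rows of equal latitude height, with the number of bins per row proportional to the cosine of the row's latitude, and bin numbers starting at 1 in the southernmost row). The function names `build_grid` and `bin_to_lonlat` are hypothetical and not taken from croc or L3bin.

```python
import math
from bisect import bisect_right

NROW = 4320  # number of latitude rows, as stated in this issue


def build_grid(nrow):
    """Row centre latitudes, bins per row, and first bin number of each row.

    ASSUMPTION: integerized sinusoidal grid, bins numbered from 1,
    rows ordered south to north.
    """
    latbin = [(i + 0.5) * 180.0 / nrow - 90.0 for i in range(nrow)]
    numbin = [int(2 * nrow * math.cos(math.radians(lat)) + 0.5) for lat in latbin]
    basebin = [1]
    for n in numbin[:-1]:
        basebin.append(basebin[-1] + n)
    return latbin, numbin, basebin


def bin_to_lonlat(bin_num, latbin, numbin, basebin):
    """Centre longitude and latitude (degrees) of a 1-based bin number."""
    row = bisect_right(basebin, bin_num) - 1  # last row whose first bin <= bin_num
    lon = 360.0 * (bin_num - basebin[row] + 0.5) / numbin[row] - 180.0
    return lon, latbin[row]


latbin, numbin, basebin = build_grid(NROW)
total_bins = basebin[-1] + numbin[-1] - 1

# First bin_num shown in the sample output of this issue.
lon, lat = bin_to_lonlat(276025, latbin, numbin, basebin)
print(total_bins, lon, lat)
```

The row count matches the `BinIndex` length of 4320 reported for the file; the per-row values computed here could be checked against `BinIndex` itself rather than trusted blindly.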
    (info <- rhdf5::h5ls(file))
    Datatype: binDataType
    Datatype: binIndexType
    Datatype: binListType
                     group                name       otype   dclass     dim
    0                    / level-3_binned_data   H5I_GROUP
    1 /level-3_binned_data            BinIndex H5I_DATASET COMPOUND    4320
    2 /level-3_binned_data             BinList H5I_DATASET COMPOUND 1956874
    3 /level-3_binned_data          binDataDim H5I_DATASET    FLOAT       0
    4 /level-3_binned_data         binIndexDim H5I_DATASET    FLOAT       0
    5 /level-3_binned_data          binListDim H5I_DATASET    FLOAT       0
    6 /level-3_binned_data             chlor_a H5I_DATASET COMPOUND 1956874
    7                    /  processing_control   H5I_GROUP
    8  /processing_control    input_parameters   H5I_GROUP
add example Parquet for issue #5
a1dd4d2
Justus (keewis) provided this Python code to unpack the structured arrays:

    import xarray as xr

    tree = xr.open_datatree("AQUA_MODIS.20241125.L3b.DAY.CHL.NRT.nc")

    # The way to transform the recarray into variables is:
    def extract_structured_variables(rec):
        # One variable per field of the structured (compound) dtype
        return xr.Dataset({name: (rec.dims, rec.data[name]) for name in rec.dtype.names})

    def process_recarrays(node):
        # Each compound variable in the group becomes its own child node
        return xr.DataTree.from_dict(
            {name: extract_structured_variables(var) for name, var in node.data_vars.items()}
        )

    def postprocess(tree):
        return tree.assign({"/level-3_binned_data": process_recarrays(tree["/level-3_binned_data"])})