`arrow::open_dataset()` produces the same wildly uninformative error for a whole range of problems, from columns in the supplied schema not matching columns in the data (e.g. opening data that still had `type` and `type_id` columns after we changed to `output_type` and `output_type_id`) to mis-specification of a field's data type (e.g. trying to cast a `double` or `character` column to `integer`).
For example, here I try to cast the character field `output_type` to the `int32` data type.
library(hubUtils)
library(arrow)
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>
#>     timestamp
model_output_schema <- schema(
  origin_date = date32(),
  target = string(),
  horizon = int32(),
  location = string(),
  output_type = int32(),
  output_type_id = string(),
  value = int32(),
  model_id = string()
)
model_output_dir <- system.file("testhubs/simple/model-output", package = "hubUtils")
mod_out_con <- connect_model_output(model_output_dir, file_format = "csv",
                                    schema = model_output_schema)
#> Error in `arrow::open_dataset()` at hubUtils/R/connect_model_output.R:32:8:
#> ! Invalid: No non-null segments were available for field 'model_id'; couldn't infer type
#> Backtrace:
#>     ▆
#>  1. ├─hubUtils::connect_model_output(...)
#>  2. └─hubUtils:::connect_model_output.default(...) at hubUtils/R/connect_model_output.R:17:4
#>  3.   └─arrow::open_dataset(...) at hubUtils/R/connect_model_output.R:32:8
#>  4.     └─base::tryCatch(...)
#>  5.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  6.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  7.           └─value[[3L]](cond)
#>  8.             └─arrow:::augment_io_error_msg(e, call, format = format)
#>  9.               └─rlang::abort(msg, call = call)
The error thrown:
#> Error in `arrow::open_dataset()` at hubUtils/R/connect_model_output.R:32:8:
#> ! Invalid: No non-null segments were available for field 'model_id'; couldn't infer type
has sent us on many a wild goose chase without providing any useful pointers to the actual problem, and will likely be even more confusing to downstream hub users.
Our options are:
1. Report the poor error handling to arrow and wait for a resolution in the package itself.
2. Try to capture and analyse the errors and produce our own messages within hubUtils.

I feel we should definitely report the behaviour whatever else we decide.
While I'm leaning towards 2, on the principle that our functions currently produce really unhelpful error messages, it may not be that straightforward to implement.
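As a starting point for discussing option 2, here is a minimal sketch of what capturing the error within hubUtils might look like. Both `find_unexpected_columns()` and `open_dataset_checked()` are hypothetical names, not existing hubUtils functions. The sketch assumes the model output files are CSVs with a header row, and that schema fields missing from a file header (such as `model_id`) may legitimately come from partitioning, so only columns present in a file but absent from the schema are flagged. Any other failure is re-thrown with added context and the original arrow message.

```r
# Hypothetical helper (not part of hubUtils): for each CSV file under `dir`,
# list the header columns that are absent from the expected field names.
find_unexpected_columns <- function(dir, expected_names) {
  files <- list.files(dir, pattern = "\\.csv$", recursive = TRUE,
                      full.names = TRUE)
  unexpected <- lapply(files, function(f) {
    # Read a single row only; check.names = FALSE keeps column names verbatim.
    header <- names(utils::read.csv(f, nrows = 1, check.names = FALSE))
    setdiff(header, expected_names)
  })
  names(unexpected) <- files
  # Keep only the files that have at least one unexpected column.
  Filter(length, unexpected)
}

# Hypothetical wrapper (an assumption, not the current hubUtils code):
# check column names first, then add context to any arrow error.
open_dataset_checked <- function(dir, schema, format = "csv", ...) {
  bad <- find_unexpected_columns(dir, names(schema))
  if (length(bad) > 0) {
    details <- vapply(names(bad), function(f) {
      paste0(f, ": ", paste(bad[[f]], collapse = ", "))
    }, character(1))
    stop("Columns found in data but not in schema:\n",
         paste(details, collapse = "\n"), call. = FALSE)
  }
  tryCatch(
    arrow::open_dataset(dir, schema = schema, format = format, ...),
    error = function(e) {
      stop("Could not open model output in '", dir, "' with the supplied ",
           "schema. Check that field names and data types match the files.\n",
           "Original arrow error: ", conditionMessage(e), call. = FALSE)
    }
  )
}
```

This would catch the `type`/`output_type` renaming case before arrow is called at all. Data type mis-specification would still surface through arrow, only with a pointer to where to look, so it is a partial answer at best.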
Created on 2023-06-13 with reprex v2.0.2