more convenient representations of required/optional task id values #16

Open
elray1 opened this issue Nov 11, 2022 · 5 comments · Fixed by #17
Labels
enhancement New feature or request

Comments

@elray1
Contributor

elray1 commented Nov 11, 2022

Is there a way we could simplify a situation where there are multiple overlapping sets of values for task ids?

Examples:

Location:

us: “US”
us_states: [“US”, “01”, … ]
us_states_counties: above plus all the counties

Horizon:

horizon_1: [1]
horizon_2: [1,2]
horizon_3: [1,2,3]
…
horizon_52: [1, 2, 3, …, 52]

Ideas for a better way

Idea 1: Can we simplify by specifying data type and range rather than providing lists?

Ex. 1: horizon

Rather than horizon: [1, 2, 3, …, 52]
Something like horizon: {type:integer, min:1, max:52}

Ex. 2: origin_date
Rather than origin_date: [“2020-04-07”, “2020-04-14”, …, “2022-12-05”]
Something like origin_date: {type:date, min: “2020-04-07”}, perhaps with some way to specify weekday?
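
To make Idea 1 concrete, here is a minimal sketch of how a range-style spec could be expanded back into the explicit lists used today. The function names, the `step_days` key, and the explicit `end` argument are assumptions made for illustration (the spec above has no max for dates); none of this is an existing hub config feature.

```python
from datetime import date, timedelta


def expand_integer_range(spec):
    """Expand {"type": "integer", "min": a, "max": b} into [a, ..., b]."""
    return list(range(spec["min"], spec["max"] + 1))


def expand_date_range(spec, end):
    """Expand a date spec into ISO date strings from spec["min"] up to `end`.

    `step_days` is a hypothetical key (default 7, i.e. weekly), so every
    value falls on the same weekday as spec["min"].
    """
    current = date.fromisoformat(spec["min"])
    step = timedelta(days=spec.get("step_days", 7))
    values = []
    while current <= end:
        values.append(current.isoformat())
        current += step
    return values


horizon = expand_integer_range({"type": "integer", "min": 1, "max": 52})
origin_dates = expand_date_range(
    {"type": "date", "min": "2020-04-07"}, end=date(2020, 4, 28)
)
```

Stepping by a fixed number of days from the minimum date is one simple way to cover the "specify weekday" part, since the weekday is then implied by the start date.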

Idea 2: Some way to specify concatenation of arrays within the json format?
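
A rough sketch of what Idea 2 might look like once the config has been read into memory: named value sets are either literal lists or a concatenation of other named sets. The `concat` key, the `resolve_value_set` helper, and the short lists of example codes are all hypothetical, chosen only to mirror the location example above.

```python
def resolve_value_set(name, definitions, _seen=()):
    """Return the flat list of values for `name`.

    A definition is either a literal list or {"concat": [other set names]}.
    """
    if name in _seen:
        raise ValueError(f"circular reference involving {name!r}")
    definition = definitions[name]
    if isinstance(definition, list):
        return list(definition)
    values = []
    for part in definition["concat"]:
        values.extend(resolve_value_set(part, definitions, _seen + (name,)))
    return values


definitions = {
    "us": ["US"],
    "states": ["01", "02"],            # illustrative subset only
    "counties": ["01001", "01003"],    # illustrative subset only
    "us_states": {"concat": ["us", "states"]},
    "us_states_counties": {"concat": ["us_states", "counties"]},
}

locations = resolve_value_set("us_states_counties", definitions)
```

With this shape each larger set is written once in terms of the smaller ones, rather than repeating the overlapping values in every list.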

@elray1 elray1 added the enhancement New feature or request label Nov 23, 2022
@elray1 elray1 transferred this issue from hubverse-org/hubDocs Nov 23, 2022
@elray1
Contributor Author

elray1 commented Jul 13, 2023

I am reopening this issue as I think the ideas here are still not implemented?

@elray1 elray1 reopened this Jul 13, 2023
@annakrystalli
Member

Sorry, I deleted my previous comment as I misread the original suggestion.

Some of this might be possible but will likely take a lot of effort to work through the implications and eventually implement.

@elray1
Contributor Author

elray1 commented Jul 13, 2023

makes sense. I think of this as a low priority enhancement, but just wanted to keep the record of the idea alive :)

@annakrystalli
Member

annakrystalli commented Jul 18, 2023

So while working on hubValidations I've been digging more deeply into the way the European hub validates values and data types, and they use a cool trick which we currently can't use.

We use our schema to validate config files, which in turn specify arrays of valid values for each task ID, and we then use those arrays to validate submission files. The European hub instead uses schema files to validate submitted data directly (by converting data.frame data to JSON): https://github.com/covid19-forecast-hub-europe/HubValidations/blob/13a1ad4611b3ba0113439c53245cb4b3826e5d2f/R/validate_model_data.R#L59-L72

One side effect of this is flexibility in how they can specify valid values for each task ID (e.g. ranges, patterns, etc.), which seems relevant to this issue. The downside is that I'm not sure how you could generalise or validate their schema (config file) the way we have; effectively, the effort and potential complications I feared for the proposal in this issue apply. I thought I'd mention it here as a point of interest.
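
For readers unfamiliar with the approach, here is a small sketch of the general idea of checking each submitted row directly against range and pattern rules, instead of against enumerated lists. This is not the European hub's implementation (the linked code is in R and validates JSON converted from a data.frame); the `RULES` table and `validate_row` function are hypothetical, with keys loosely modelled on the JSON Schema keywords `minimum`, `maximum` and `pattern`.

```python
import re

# Hypothetical per-column rules; the values are illustrative only.
RULES = {
    "horizon": {"type": int, "minimum": 1, "maximum": 52},
    "location": {"type": str, "pattern": r"US|\d{2}"},
    "origin_date": {"type": str, "pattern": r"\d{4}-\d{2}-\d{2}"},
}


def validate_row(row, rules):
    """Return a list of error messages for one row (empty if valid)."""
    errors = []
    for field, rule in rules.items():
        if field not in row:
            errors.append(f"{field}: missing")
            continue
        value = row[field]
        if not isinstance(value, rule["type"]):
            errors.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "minimum" in rule and value < rule["minimum"]:
            errors.append(f"{field}: {value} is below {rule['minimum']}")
        if "maximum" in rule and value > rule["maximum"]:
            errors.append(f"{field}: {value} is above {rule['maximum']}")
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{field}: {value!r} does not match the pattern")
    return errors


good = validate_row(
    {"horizon": 3, "location": "US", "origin_date": "2022-12-05"}, RULES
)
bad = validate_row(
    {"horizon": 60, "location": "USA", "origin_date": "2022-12-05"}, RULES
)
```

The trade-off described above shows up here too: rules like these are compact, but a hub-agnostic tool would still need some way to validate the rules themselves.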

@sbfnk

sbfnk commented Jul 19, 2023

Tagging @Bisaloo who designed and implemented the validation for the European hub for any additional thoughts/comments on the topic.


3 participants