Migration to new API #131

Open
24 tasks
sebffischer opened this issue Jan 9, 2024 · 0 comments
sebffischer commented Jan 9, 2024

The new OpenML API is currently about 50% done, and there will be some changes that are partly listed here:
https://openml.github.io/server-api/migration/.
The new API is hosted here: https://test.openml.org/py/docs

Filter specifications are now part of the request body rather than the query.
Migration to the new API happens in this branch: https://github.com/mlr-org/mlr3oml/tree/feat/new-api

Overview table of API requests that are currently supported by mlr3oml and their status (i.e. whether it works in the branch with the new API).

  • Download Data Description:
    • Tags might differ: datasets are now annotated by LLMs, and the database of the new server is a snapshot from before that date, so it does not have these new tags. This is not an issue.
    • The format of processing_time is now slightly different, so it had to be adjusted.
    • Some numbers are now correctly encoded as integers instead of strings.
    • The different arff link seems to be a bug.
    • The different parquet_url is intended, as the buckets are being restructured.
    • We need to add some conversions but could also remove others. E.g. empty character vectors (length 0) are returned as list(), whereas vectors of length 1 are returned as character(1).
    • In the future, the ignore_attribute, row_id_attribute and target_attribute should never be NULL, but always character vectors, so expect_oml_data() can change the test from expect_character(..., null.ok = TRUE) to expect_character(..., null.ok = FALSE)
  • Download Task Description
    • the $input field now has a different structure. This has to be addressed.
    • The task_name field is now name
  • Download Task Splits: Take new structure of task description into account
  • Download Flow Description
  • Download Run Description
  • Download Predictions
  • List Data
  • List Tasks
  • List Flows
  • List Setups
  • List Evaluations
  • List Measures
  • Upload Dataset
  • Upload Task
  • Upload Collection
  • Download Data (arff)
    • Nothing should change here
  • Download Data (parquet)
    • The url changes but this should not affect the code as we retrieve it from the metadata description
  • Download Data qualities
    • Nothing should change here.
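
The conversions mentioned under "Download Data Description" could look roughly like the following. This is only a minimal sketch in base R: `as_chr_field()` is a hypothetical helper and not part of mlr3oml, and the example description is made up to mimic the inconsistencies listed above (a field returned as `list()`, a field that is `NULL`).

```r
# Hypothetical helper (not part of mlr3oml): coerce a description field
# to a character vector, no matter how the server encoded "empty".
as_chr_field = function(x) {
  if (is.null(x) || length(x) == 0L) {
    return(character(0))
  }
  as.character(unlist(x, use.names = FALSE))
}

# Made-up description that mimics the inconsistent encodings.
desc = list(
  target_attribute = "class",
  ignore_attribute = list(),
  row_id_attribute = NULL
)

fields = c("target_attribute", "ignore_attribute", "row_id_attribute")
normalized = lapply(setNames(fields, fields), function(f) as_chr_field(desc[[f]]))
str(normalized)
```

With a normalization step like this, every one of the three fields is always a character vector, which is what a check such as expect_character(..., null.ok = FALSE) would require.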

Other stuff that will / might break:

  • return(NA_integer_)
    (These custom "No results" oml codes will soon be replaced by a single HTTP status code and will no longer be an error.)

  • if (response$oml_code %in% c(107L)) {

    (error code might be adjusted)

  • This needs to be updated as this information is not part of the query anymore but of the body

    filters = imap_chr(filters, function(x, name) {

  • if (response$oml_code %in% c(107L)) {
    Error code 107 will probably not be there anymore

  • Increment cache version when everything is done
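
To illustrate the filter change, here is a rough sketch of the difference between encoding filters in the request URL and sending them in the body. The filter names and values are illustrative only, the name/value path encoding is an assumption about the current code, and the exact body shape expected by the new API is also an assumption. Serialization via jsonlite is optional and only runs if the package is installed.

```r
# Hypothetical filter set; names and values are illustrative only.
filters = list(limit = 10L, tag = "study_14")

# Old style (assumed): filters become name/value segments of the URL.
old_path = paste(
  names(filters),
  vapply(filters, as.character, character(1)),
  sep = "/", collapse = "/"
)
print(old_path)

# New style (assumed shape): the same filters are sent in the request body.
body = filters
if (requireNamespace("jsonlite", quietly = TRUE)) {
  cat(as.character(jsonlite::toJSON(body, auto_unbox = TRUE)), "\n")
}
```

Keeping the filters as a plain named list until the last step means the code that validates them does not need to know whether they end up in the URL or in the body.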
