Setting ignore_attribute with edit_dataset only uses last attribute #1289
Comments
The XML sent by the client code is
Not sure if that is correct? @joaquinvanschoren ?
Strange, when I query the production database, for dataset SELECT * FROM `data_feature` WHERE `did`=45686
Looks like a bug in the API: This seems to overwrite the previous values in the same request. It also looks like every call replaces the columns to be ignored rather than adding to them. As a workaround, passing all values at once (as a comma-separated string) should work, though I haven't tested it. @PGijsbers what do you think? Is it worth fixing this in API v1, or should we use a workaround now and fix it in API v2?
Eh, could it be that the /data/edit endpoint only changes the dataset table, not the data_feature table? Looks like it:
The @amueller Does Joaquin's suggested workaround work? @joaquinvanschoren If the work-around works, we could hotfix
Do you mean passing all values at once as a string? I tried that before opening the issue, but the server-side validation didn't seem to like it the way I did it. There might be another way, though?
It would be great to have a work-around for this, I'd really like to use this dataset.
For me the workaround seems to work? openml.org/d/45705
import openml
ds = openml.datasets.get_dataset("cylinder-bands", version=2)
new_did = openml.datasets.fork_dataset(data_id=ds.id)
openml.datasets.edit_dataset(new_did, ignore_attribute='timestamp,cylinder_number,job_number')
after removing cache:
Please try again with the provided script; perhaps there were other formatting errors when you tried the workaround. If that still doesn't work, please provide the error message. And also the dataset id of the dataset that you tried to modify (i.e., your "fork" ( Running on a dev version of
I think I had spaces after the comma, that might have been the issue. Thank you! Version two is my fork IIRC :)
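For anyone hitting the same thing, here is a minimal sketch of building the comma-separated string without stray spaces. The helper name `join_ignore_attributes` is hypothetical (it is not part of openml-python), and whether whitespace was really what the server-side validation rejected is only a guess based on the comment above.

```python
def join_ignore_attributes(names):
    """Join column names into one comma-separated string.

    Hypothetical helper, not part of openml-python: it strips surrounding
    whitespace from each name and drops empty entries, so the result has
    no spaces after the commas.
    """
    cleaned = [name.strip() for name in names if name.strip()]
    return ",".join(cleaned)


ignore = join_ignore_attributes(["timestamp", " cylinder_number", "job_number "])
print(ignore)  # timestamp,cylinder_number,job_number

# The string would then be passed as in the script above, e.g.:
# openml.datasets.edit_dataset(new_did, ignore_attribute=ignore)
```

Passing everything in a single call matters here because, as noted earlier in the thread, each call appears to replace the ignored columns rather than add to them.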
FYI it seems that if you fork a dataset, it keeps the owner by default. I'm not sure if that's intentional?
I am not sure what you mean by that. I see multiple uploaders:
Hm ok so this is the last person that edited it? Because 45705 was the one I created and it's now "uploaded" by you.
I am a little confused. Are you saying that the "uploader" for a specific dataset id changed? E.g., 45705 was first marked as "uploaded by you" and now "uploaded by me"? Because I don't think that's supposed to happen.
So I tried to create a new version of `cylinder-bands` because of openml/openml-data#59. However, that seems to have replaced the `ignore_attribute` just with "job_number", as you can see here: https://www.openml.org/api/v1/json/data/45686
Opening this here since I used the Python interface, but the Python code looks pretty easy, so maybe it's an issue in the backend?