Bulk Delete: Ghost filesets #2530
Comments
Out of the 22 filesets that were broken, I was able to delete 3n204545p through vt150r99h (1 - 13) from the database successfully. t722h937w through b2773w401 (14 - 22) have something particularly broken about them. I can update them through the command line and save them, and the updated information persists. I can use LDP to request a resource and get a graph from Fedora, and it looks perfectly acceptable. However, every attempt to delete them from Fedora fails. We tried curl commands, ActiveFedora, LDP, and the Fedora UI, but none of them removed the items. We might need to have a deeper discussion about these items in the database. I've done all I can within a reasonable amount of time to try and remove them.
FWIW, that lines up exactly with the two different groups described at the top of the ticket. The 9 particularly broken ones are the same ones that I "deleted-but-not-really" in the UI.
QA: I have confirmed that PIDs 1-13 are no longer found in Solr, while 14-22 still have apparitions there. So QA pass for the scope of the first group. @shieldsb Since the mystery remains but doesn't have any serious known consequences, can we keep this ticket open but with a low priority and/or icebox label?
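For anyone repeating this QA check, here is a rough sketch of how the "still in Solr or not" split could be scripted. It is not what was actually run for this ticket. The core URL passed to `build_solr_query_url` is up to the caller, both function names are made up for illustration, and the response below is hand-written sample data in the shape of Solr's standard JSON select response, not real output from our index.

```python
import json
from urllib.parse import urlencode


def build_solr_query_url(solr_core_url, pids):
    """Build a select URL asking Solr for the ids of any matching documents.

    solr_core_url is an assumption supplied by the caller; pids must be non-empty.
    """
    id_clause = " OR ".join('"{}"'.format(pid) for pid in pids)
    params = {
        "q": "id:({})".format(id_clause),
        "fl": "id",          # only the id field is needed
        "rows": len(pids),   # at most one document per PID
        "wt": "json",
    }
    return "{}/select?{}".format(solr_core_url.rstrip("/"), urlencode(params))


def split_present_absent(pids, solr_response):
    """Return (present, absent) PID lists, preserving the input order."""
    found = {doc["id"] for doc in solr_response["response"]["docs"]}
    present = [pid for pid in pids if pid in found]
    absent = [pid for pid in pids if pid not in found]
    return present, absent


# Hand-written sample data for illustration only.
sample_pids = ["3n204545p", "vt150r99h", "t722h937w", "b2773w401"]
sample_response = json.loads(
    '{"response": {"numFound": 2, '
    '"docs": [{"id": "t722h937w"}, {"id": "b2773w401"}]}}'
)

present, absent = split_present_absent(sample_pids, sample_response)
print("still in Solr:", present)
print("gone from Solr:", absent)
```

In practice the URL from `build_solr_query_url` would be fetched and the parsed JSON handed to `split_present_absent`; the sketch keeps the two steps separate so the comparison can be checked without touching a live index.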
Got it. I'll update the ticket label and board @carakey
Descriptive summary
Please bulk delete the 22 fileset objects whose PIDs are listed in the attached CSV file. These are not accessible from the front end, so they cannot be deleted through the standard UI process.
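As a starting point, a minimal sketch of turning the CSV into a list of delete requests might look like the following. Several things here are assumptions: the `pid` column name, the base URL, and especially the idea that a fileset lives at `<base>/<pid>` (the real repository path may be nested differently). The requests are only built and printed, never sent, so this is safe to run as-is.

```python
import csv
import io
from urllib.request import Request


def read_pids(csv_text, column="pid"):
    """Return the unique, non-empty PIDs from the given column, in file order.

    The column name "pid" is hypothetical; adjust it to the attached CSV.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    pids = []
    for row in reader:
        pid = (row.get(column) or "").strip()
        if pid and pid not in seen:
            seen.add(pid)
            pids.append(pid)
    return pids


def build_delete_requests(fedora_base_url, pids):
    """Build (but do not send) one HTTP DELETE request per PID.

    Assumes each object sits directly under fedora_base_url, which may not
    match the real repository layout.
    """
    base = fedora_base_url.rstrip("/")
    return [Request("{}/{}".format(base, pid), method="DELETE") for pid in pids]


# Sample CSV text for illustration; the duplicate row is dropped.
sample_csv = "pid\n3n204545p\nvt150r99h\n3n204545p\n"
requests_to_send = build_delete_requests(
    "http://localhost:8080/rest/hypothetical", read_pids(sample_csv)
)
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Keeping the request building separate from the sending makes it easy to review the exact URLs before anything is removed.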
Documentation
A number of problem fileset objects were identified in the preservation assessment format inventory. One subset consists of 13 filesets that exist in Solr with minimal metadata but can't be reached by URL (500 errors on the fileset show pages; 404 errors on the direct download links). I'm referring to these as "ghost filesets."
A separate subset of 31 problem filesets, grouped as "Ingest errors with duplicate functional filenames elsewhere" on ticket #2491, did have functional fileset show pages. However, after I attempted to delete these fileset objects from the UI using the Delete button, nine of the 31 persisted in Solr and now show the same characteristics as the first group -- they are in Solr but have minimal metadata, no characterization information, no parents, and cannot be viewed in SA. In other words, they became ghost filesets, too, which suggests the first group was likely deleted in a similar way but somehow stuck around in Solr. I haven't been able to identify a pattern distinguishing the 22 filesets that were successfully deleted from the 9 ghosts.
Related work
This is an offshoot of #2491.