Make it possible to Import IIIF collections #90
Comments
Thanks @Abbe98, this sounds like a really interesting use case. I can definitely see how we might have some specific tools/functions to support IIIF, but I'm less sure what these might look like in reality. Working from the start, could you say a bit more about how you'd see this working? For example, the collection you posted contains a set of further collections, which in turn contain a mixture of more collections and manifests. What would the resulting OpenRefine project look like? What might be a typical data cleaning task within the resulting project? |
I would intuitively keep this issue in the CommonsExtension repository unless there are things on OpenRefine's side that need to be changed for such an importer to be implemented there. |
@ostephens in my opinion, it would only fetch the data in manifests, and each manifest would become a record, since one manifest generally represents a single media file.
Yeah, this probably shouldn't be in core. Not sure the CommonsExtension is the right place either considering that the features aren't dependent on each other. |
Checking my understanding, if the user were to specify the root URL "https://lbiiif.riksarkivet.se/collection/kartor-och-ritningar" the importer would be required to retrieve the content of the "items" array found at that URL and then:
Through this process the importer would work through all Collections and Manifests that are discoverable from the original root URL and eventually end up with a project that contains all the Manifests that were found? Have I understood the intention correctly? |
@ostephens yes. I guess one might want to implement some optional limits (a maximum number of levels, a maximum number of records, etc.). |
Thanks @Abbe98. I'm not a IIIF expert, but I think it's allowed for collections to include items that are from anywhere online? So we could be ending up doing some extremely large-scale crawling here? (this could also be limited in some way of course - such as allowing the user to specify a domain as well as number of levels) |
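For reference, the traversal discussed above could be sketched roughly as follows. This is only an illustration, not an agreed design: the function name `collect_manifests`, its parameters, and the injected `fetch` callable are all hypothetical, and it assumes IIIF Presentation API 3 documents, where a Collection lists its children under `items`, each with an `id` and a `type`.

```python
from collections import deque
from urllib.parse import urlparse


def collect_manifests(root_url, fetch, max_depth=3, max_records=1000,
                      allowed_hosts=None):
    """Breadth-first walk of a collection tree, returning manifest URLs.

    fetch is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the parsed JSON document as a dict.
    """
    manifests = []
    seen = {root_url}                  # guards against cycles and duplicates
    queue = deque([(root_url, 0)])     # (collection URL, depth below root)
    while queue and len(manifests) < max_records:
        url, depth = queue.popleft()
        collection = fetch(url)
        for item in collection.get("items", []):
            item_id, item_type = item.get("id"), item.get("type")
            if not item_id or item_id in seen:
                continue
            # Optional domain limit, so the walk cannot wander across the web.
            if allowed_hosts is not None and urlparse(item_id).netloc not in allowed_hosts:
                continue
            seen.add(item_id)
            if item_type == "Manifest":
                manifests.append(item_id)
                if len(manifests) >= max_records:
                    break
            elif item_type == "Collection" and depth < max_depth:
                queue.append((item_id, depth + 1))
    return manifests
```

Passing `fetch` in keeps the sketch independent of any HTTP library; a real importer would also need error handling, rate limiting, and a decision on what to do with IIIF Presentation API 2 documents.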
@Abbe98 in the case of finding a manifest, how would you want the information in the manifest stored in an OpenRefine row? To take an example from your root we have the collection ID
What would the row/record stored in OpenRefine look like in this case? |
I think it would be great to discuss with the (very active) IIIF community how they'd like this to be built, and maintained over the longer term. |
I agree with @wetneb that this should be moved to a more appropriate repository. The example collection manifest looks like JSON-LD, so it's already supported by OpenRefine, but with the limitations inherent in mapping tree-shaped (JSON & XML) formats to a rectangular grid. The universe of JSON applications is obviously way too big to be building specific support into OpenRefine for each of them. |
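To make the tree-to-grid limitation concrete, here is a small sketch that flattens a nested JSON document into path-named columns. It is not how OpenRefine's JSON importer works; the `flatten` helper and the sample manifest fragment are made up for illustration, with the fragment only loosely following the IIIF Presentation API 3 shape (language maps with arrays of strings).

```python
def flatten(node, prefix=""):
    """Flatten nested dicts and lists into {dotted.path: scalar} pairs."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        return {prefix: node}          # scalar leaf: one cell in the row
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


# Hypothetical manifest fragment, invented for this example.
manifest = {
    "id": "https://example.org/m1",
    "type": "Manifest",
    "label": {"en": ["Map of X"]},
    "metadata": [
        {"label": {"en": ["Date"]}, "value": {"en": ["1700"]}},
    ],
}

row = flatten(manifest)
for column, cell in row.items():
    print(column, "=", cell)
```

Each extra `metadata` entry or language adds more columns such as `metadata.1.label.en.0`, so two manifests rarely end up with the same set of columns, which is the mismatch with a rectangular grid described above.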
So I have transferred it to the CommonsExtension repo, where it does indeed seem to duplicate #19. Not sure which one people want to keep? |
I'm not sure if this is the right place after all. It may very well be that the IIIF community would mainly prefer to use IIIF integration in OpenRefine for generic data cleaning (not for Wikimedia Commons import)! IMO they are the ones to say/decide. I would strongly suggest a bit of user research, asking potential users about their primary predicted use cases. |
My intent when I created this issue had nothing to do with Wikimedia Commons. I too agree that it should be in a separate extension (I believe half of the core extensions should be moved out of core...), but I thought high-level extension requests lived in core's issue tracker. |
I have created a wiki page to list some extension requests and listed IIIF there: |
IIIF and the IIIF Presentation API are used by many GLAM institutions, and the ability to import records from IIIF Collections would greatly benefit reusers who wish to clean GLAM data, as well as users of the Commons extension.
Proposed solution
Given the collection root URL, an importer would traverse its content and fetch data from the various IIIF manifests in it.
Additional context