Small OAI Harvester for iconographic material from the Bibliotheca Hertziana’s Fototeca

dvstudies/OAI_harvester

OAI_harvester

This repository implements an OAI harvester that collects iconographic material from the Online Catalogue of the Photographic Collection (Fototeca) at the Max-Planck Center for the History of Art and Architecture – Bibliotheca Hertziana (Biblhertz).

The repository contains the following files:

  • requirements.txt The modules that must be installed in order to run Biblhertz_OAI_harvester.ipynb. Install them with pip install -r requirements.txt
  • Biblhertz_IMG_harvester.py A Python script that downloads images. It relies on the database built by Biblhertz_OAI_harvester.py, so that script has to be run first. The images to download are selected with the following parameters:
--type [Zeichnung Text Ort ...]
--artist [Caravaggio Bernini ...]
--title []
--date_begin [1560]
--date_end [1760]
--medium [Marmor Öl Papier ...]
--all [True | False]
  • Biblhertz_OAI_harvester.py A Python script version of the OAI harvester. It queries the https://oai.biblhertz.it/foto/oai-pmh endpoint, retrieves all objects with identifiers of the form '08######' and stores their metadata in a biblhertz.db database. To run it from a terminal: python Biblhertz_OAI_harvester.py
  • Biblhertz_OAI_harvester.ipynb A Python notebook version of the same harvester. It queries the same endpoint, retrieves the same objects and stores their metadata in a biblhertz.db database.
  • Biblhertz_foto_retrieval.ipynb A first draft for collecting digital images based on a local .xml file of the online database. It will be deleted soon.
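
To illustrate the harvesting step described in the list above, here is a minimal sketch of how an OAI-PMH endpoint can be paged through with resumption tokens and the matching identifiers stored in SQLite. It is not the code of Biblhertz_OAI_harvester.py. The function names, the `records` table, the `oai_dc` metadata prefix and the reading of '08######' as "08 followed by six digits at the end of the identifier" are all assumptions made for the example.

```python
import re
import sqlite3
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://oai.biblhertz.it/foto/oai-pmh"  # endpoint named in this README
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}  # standard OAI-PMH namespace
# Assumption: '08######' means "08" plus six digits at the end of the identifier.
ID_PATTERN = re.compile(r"08\d{6}$")


def fetch_page(params):
    """Request one OAI-PMH response page and return the raw XML bytes."""
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def iter_identifiers(fetch=fetch_page, metadata_prefix="oai_dc"):
    """Yield every record identifier, following resumption tokens page by page."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iterfind(".//oai:header", OAI_NS):
            yield header.findtext("oai:identifier", default="", namespaces=OAI_NS)
        token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
        token = (token or "").strip()
        if not token:  # a missing or empty token marks the last page
            break
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


def harvest(db_path="biblhertz.db", fetch=fetch_page):
    """Store matching identifiers (hypothetical 'records' table) and return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("CREATE TABLE IF NOT EXISTS records (identifier TEXT PRIMARY KEY)")
            for identifier in iter_identifiers(fetch):
                if ID_PATTERN.search(identifier):
                    conn.execute(
                        "INSERT OR IGNORE INTO records (identifier) VALUES (?)",
                        (identifier,),
                    )
        return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    finally:
        conn.close()
```

Calling `harvest()` would contact the live endpoint. Passing a different `fetch` callable makes it possible to try the paging logic against saved XML pages instead.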

How to use the harvester

First run Biblhertz_OAI_harvester.py in your terminal; this creates the database biblhertz.db in your current folder. Then, if you also want to collect the images that belong to the metadata, run the Biblhertz_IMG_harvester.py script. It creates a folder called biblhertz_images in your current directory.

python Biblhertz_OAI_harvester.py
python Biblhertz_IMG_harvester.py --type Zeichnung Malerei --date_begin 1560
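
The second command passes two values to --type and one year to --date_begin. As a rough sketch of how such options could be parsed with Python's standard argparse module (this is not taken from Biblhertz_IMG_harvester.py; the helper names `parse_bool` and `build_parser`, the defaults, and the choice to accept several values per text filter are assumptions):

```python
import argparse


def parse_bool(text):
    """Interpret the literal strings 'True' / 'False' expected by --all."""
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    raise argparse.ArgumentTypeError("expected True or False")


def build_parser():
    """Build a parser for the filter options listed in this README."""
    parser = argparse.ArgumentParser(description="Select which images to download")
    # Text filters accept one or more values, e.g. --type Zeichnung Malerei
    for name in ("type", "artist", "title", "medium"):
        parser.add_argument(f"--{name}", nargs="+", default=[])
    # Year bounds are parsed as integers; None means "no bound" (assumed default)
    parser.add_argument("--date_begin", type=int)
    parser.add_argument("--date_end", type=int)
    parser.add_argument("--all", type=parse_bool, default=False)
    return parser


# Same arguments as the second command above, passed as an explicit list
args = build_parser().parse_args(["--type", "Zeichnung", "Malerei", "--date_begin", "1560"])
print(args.type, args.date_begin, args.all)
```

With `nargs="+"`, every value following an option is collected into a list until the next option starts, which is why `Zeichnung Malerei` can be given without quoting.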

Specific Documentation

The Fototeca provides documentation about its OAI system on its GitHub account 'hertzphoto':

The data in the online catalogue is organized according to the Marburger Informations-, Dokumentations- und Administrations-System (MIDAS) (Bove, Heusinger and Kailus 2001).

Reference projects:

The Städel Museum provides best-practice documentation about their OAI interface

The Fondazione Federico Zeri – Università di Bologna provides a good example of a query system offered as a web app
