update use case
RodionLisch committed Dec 10, 2024
1 parent a2d78ac commit db00b3b
Showing 3 changed files with 53,656 additions and 8 deletions.
89 changes: 89 additions & 0 deletions nfdinspector_tutorials/DDB_lido_config.json
@@ -0,0 +1,89 @@
{
"workID": {
"pattern": ""
},
"title": {
"inspect": true,
"unique": true,
"distinct_from_type": true,
"min_word_num": 1,
"max_word_num": 20
},
"category": {
"inspect": true,
"ref": true,
"patterns": {
"label": "",
"ref": ""
}
},
"object_work_type": {
"inspect": true,
"ref": true,
"patterns": {
"label": "",
"ref": ""
}
},
"classification": {
"inspect": false,
"ref": true,
"patterns": {
"label": "",
"ref": ""
}
},
"object_description": {
"inspect": false,
"unique": true,
"min_word_num": 20,
"max_word_num": 500
},
"materials_tech": {
"inspect": false,
"ref": true,
"differentiated": false
},
"object_measurements": {
"inspect": false
},
"event": {
"inspect": false,
"ref": true
},
"subject_concept": {
"inspect": false,
"ref": true,
"min_num": 3
},
"resource": {
"inspect": true
},
"record_type": {
"inspect": true,
"ref": true,
"patterns": {
"label": "",
"ref": ""
}
},
"repository_name": {
"inspect": true,
"ref": true
},
"record_source": {
"inspect": true,
"ref": true
},
"record_rights": {
"inspect": true,
"ref": true,
"patterns": {
"label": "",
"ref": ""
}
},
"record_info": {
"inspect": true
}
}
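The `inspect` flag in each section of the configuration above decides whether that field is checked at all. As a minimal sketch using only Python's standard `json` module (not the NFDInspector API), the following lists which fields are enabled and which are disabled. The string `CONFIG_EXCERPT` is a shortened copy of a few sections from the file above, and the helper name `split_by_inspect` is hypothetical.

```python
import json

# Shortened excerpt of DDB_lido_config.json (illustration only, not the full file).
CONFIG_EXCERPT = """
{
  "workID": {"pattern": ""},
  "title": {"inspect": true, "unique": true, "distinct_from_type": true,
            "min_word_num": 1, "max_word_num": 20},
  "classification": {"inspect": false, "ref": true,
                     "patterns": {"label": "", "ref": ""}},
  "object_description": {"inspect": false, "unique": true,
                         "min_word_num": 20, "max_word_num": 500},
  "record_source": {"inspect": true, "ref": true}
}
"""


def split_by_inspect(config):
    """Return (enabled, disabled) field names based on each section's 'inspect' flag."""
    enabled, disabled = [], []
    for name, options in config.items():
        # Sections without an 'inspect' key (such as 'workID') are skipped.
        if "inspect" not in options:
            continue
        if options["inspect"]:
            enabled.append(name)
        else:
            disabled.append(name)
    return enabled, disabled


config = json.loads(CONFIG_EXCERPT)
enabled, disabled = split_by_inspect(config)
print("enabled:", enabled)
print("disabled:", disabled)
```

Running the same helper on the full file (loaded with `json.load`) would give a quick overview of which checks the DDB configuration keeps active.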
96 changes: 88 additions & 8 deletions nfdinspector_tutorials/LIDO_use_case.ipynb
@@ -15,7 +15,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
@@ -34,7 +34,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -75,7 +75,7 @@
" 'record_info': {'inspect': True}}"
]
},
"execution_count": 9,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
@@ -97,10 +97,7 @@
"When inspecting the configuration file, you can see the various setting options of the NFDInspector. In addition to checking whether certain data fields have entries, it can check properties such as uniqueness within a data set, minimum and maximum length of entries, and the presence of a reference or ID. It is also possible to check entries for formal correctness using regular expressions.<br>\n",
"In this case, the metadata should be checked for compatibility with the requirements of the German Digital Library. These requirements are listed on the __[DDB website](https://wiki.deutsche-digitale-bibliothek.de/display/DFD/Anforderungen+an+die+Lieferdaten)__ and can be viewed there. The minimum requirements stipulate that eight metadata elements must be present:\n",
"\n",
"- Data record identifier\n",
" - the identifier has to be stable\n",
" - the identifier must not contain any spaces\n",
"- Data partner Identifiere\n",
"- Data partner Identifier\n",
" - unique and persistent identifier for institution supplying the dataset to the DDB\n",
" - The identifier should preferably be an International Standard Identifier for Libraries and Related Organizations (ISIL), ISO 15511, assigned by the German ISIL Agency and Sigelstelle at the Staatsbibliothek zu Berlin. For museums, the ISIL identifiers are assigned by the Institute for Museum Research Berlin.\n",
"- Link to the digital object\n",
@@ -113,7 +110,90 @@
"- Media type\n",
"\n",
"<br>\n",
"A configuration file can map the requirements. To do this, we modify the standard configuration as follows. The already exported file can either be edited and read in again, or the configuration within the LIDOInspector instance can be changed."
"A configuration file can map these requirements. The already exported file can either be edited and read in again, or the configuration can be changed within the LIDOInspector instance. In this case, we modify the standard configuration file to fit our needs and disable all checks except the ones specified above."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"# read the modified configuration file\n",
"lido_inspector.config_file('../nfdinspector_tutorials/DDB_lido_config.json')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can ingest the LIDO XML files we want to inspect by passing their containing directory to ``lido_inspector``, and then start the inspection according to the specifications in the config file."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"lido_inspector.read_lido_files(files_path='../nfdinspector_tutorials/LIDO_xml/stereo_montandok')\n",
"lido_inspector.inspect()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After the inspection is done, the results are stored under ``LIDOInspector.inspections`` and can be exported from there in JSON format by calling the function shown in the next cell."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"lido_inspector.to_json(file_path='../nfdinspector_tutorials/results/results_DDB.json', indent=4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'inspect': True, 'ref': True}\n"
]
}
],
"source": [
"# reading a json file into a dict\n",
"with open('../nfdinspector_tutorials/lido_config.json', 'r') as file:\n",
" config_dict = json.load(file)\n",
"\n",
"print(config_dict['record_source'])"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['missing label ()', 'missing reference/ID ()']\n"
]
}
],
"source": [
"lido_inspector.read_lido_file('..\\\\nfdinspector_tutorials\\\\LIDO_xml\\\\stereo_montandok\\\\23330 copy.xml')\n",
"result = lido_inspector.inspect_repository_name(lido_inspector.lido_objects[0])\n",
"print(result)"
]
}
],
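The notebook text describes deriving the DDB configuration from the standard one by disabling every check that is not required. A minimal sketch of that step on a plain dictionary might look as follows. It does not use the NFDInspector API: the helper `enable_only` is hypothetical, and the values in `default_config` are illustrative assumptions rather than the library's actual defaults.

```python
import copy
import json


def enable_only(config, fields_to_keep):
    """Return a copy of config in which only fields_to_keep have 'inspect' set to True."""
    result = copy.deepcopy(config)  # leave the original configuration untouched
    for name, options in result.items():
        # Only sections that already carry an 'inspect' flag are changed.
        if "inspect" in options:
            options["inspect"] = name in fields_to_keep
    return result


# Illustrative stand-in for a standard configuration (assumed values).
default_config = {
    "workID": {"pattern": ""},
    "title": {"inspect": True, "unique": True},
    "object_description": {"inspect": True, "unique": True},
    "record_source": {"inspect": True, "ref": True},
}

ddb_config = enable_only(default_config, {"title", "record_source"})
print(json.dumps(ddb_config, indent=4))
```

The resulting dictionary could then be written to a file with `json.dump` and read in again as a configuration file, as the notebook does with `DDB_lido_config.json`.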