A Python-based tool for importing DCAT datasets in RDF/XML (.xml, .rdf) or Turtle (.ttl) format into the I14Y interoperability platform of the Swiss Federal Statistical Office (BFS).
- Import DCAT datasets from RDF/XML or Turtle files via the I14Y API
- Supported properties for dcat:Dataset:
Property | Requirement level |
---|---|
dct:title | mandatory |
dct:description | mandatory |
dct:accessRights (one of: PUBLIC, NON_PUBLIC, CONFIDENTIAL, RESTRICTED) | mandatory |
dct:publisher (stated in config.py) | mandatory |
dct:identifier | mandatory |
dct:issued | optional |
dct:modified | optional |
dcat:landingPage | optional |
dcat:keyword | optional |
dct:language | optional |
dcat:contactPoint | optional |
documentation (foaf:page) | optional |
schema:image | optional |
dct:temporalCoverage | optional |
dcat:temporalResolution | optional |
frequency (dct:accrualPeriodicity) | optional |
dct:isReferencedBy | optional |
dct:relation | optional |
spatial/geographical coverage (dct:spatial) | optional |
dct:conformsTo | optional |
dcat:theme | optional |
dcat:version | optional |
adms:versionNotes | optional |
prov:qualifiedAttribution and dcat:qualifiedRelation are not imported automatically; you can add this information manually on I14Y.
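To check a file against the mandatory properties in the table above before importing, something like the following sketch may help. It uses only the standard library and assumes the RDF/XML file writes datasets as typed `dcat:Dataset` elements with the properties as direct children. Other valid RDF/XML serializations (for example `rdf:Description` with an `rdf:type`) are not covered. The function name `missing_mandatory` and the sample data are illustrative and not part of this tool. The sketch skips dct:publisher because it comes from config.py, and it leaves out the access rights check for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the vocabularies used in the property table.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}

# Mandatory properties expected in the file itself
# (dct:publisher is taken from config.py, so it is not checked here).
MANDATORY = ("dct:title", "dct:description", "dct:identifier")

# Illustrative sample: a dataset that lacks dct:description.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="https://example.org/dataset/1">
    <dct:title xml:lang="de">Beispieldatensatz</dct:title>
    <dct:identifier>example-1</dct:identifier>
  </dcat:Dataset>
</rdf:RDF>
"""


def missing_mandatory(rdf_xml: str) -> dict:
    """Map each dataset URI to the list of mandatory properties it lacks."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    report = {}
    for dataset in root.findall("dcat:Dataset", NS):
        missing = [p for p in MANDATORY if dataset.find(p, NS) is None]
        report[dataset.get(about, "(blank node)")] = missing
    return report


print(missing_mandatory(SAMPLE))
# → {'https://example.org/dataset/1': ['dct:description']}
```

An empty list for a dataset means all three checked properties are present.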
- Supported properties for dcat:Distribution:
Property | Requirement level |
---|---|
dct:title (if not stated, set automatically to 'Datenexport') | mandatory |
dct:description (if not stated, set automatically to 'Export der Daten') | mandatory |
dcat:accessURL | mandatory |
dcat:downloadURL | optional |
dct:license | optional |
dct:issued | optional |
dct:modified | optional |
dct:rights | optional |
dct:language | optional |
schema:image | optional |
dcat:spatialResolutionInMeters | optional |
dcat:temporalResolution | optional |
dct:conformsTo | optional |
dcat:mediaType | optional |
dct:format | optional |
dcat:packageFormat | optional |
spdx:checksum | optional |
dcat:byteSize | optional |
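The fallback behaviour for distribution titles and descriptions can be pictured roughly as follows. This is a minimal sketch, not the tool's actual code: the dictionary representation of a distribution and the function name `normalize_distribution` are assumptions made for illustration. Only the two default strings and the mandatory status of dcat:accessURL come from the table above.

```python
# Default values stated in the distribution property table.
DEFAULT_TITLE = "Datenexport"
DEFAULT_DESCRIPTION = "Export der Daten"


def normalize_distribution(props: dict) -> dict:
    """Return a copy of props with defaults applied.

    Raises ValueError if dcat:accessURL is missing or empty,
    because no default exists for it.
    """
    if not props.get("dcat:accessURL"):
        raise ValueError("dcat:accessURL is mandatory for a distribution")
    result = dict(props)  # leave the caller's dictionary untouched
    if not result.get("dct:title"):
        result["dct:title"] = DEFAULT_TITLE
    if not result.get("dct:description"):
        result["dct:description"] = DEFAULT_DESCRIPTION
    return result


dist = normalize_distribution({"dcat:accessURL": "https://example.org/data.csv"})
print(dist["dct:title"], "|", dist["dct:description"])
# → Datenexport | Export der Daten
```

A title or description that is already present is kept unchanged; only missing or empty values are filled in.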
- Python 3.8+
- pip package manager
- Clone this repository:
git clone [repository-url]
cd import_rdf_datasets
- (Optional but recommended) Create and activate a virtual environment:
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Configure the application:
- Edit src/config.py with your I14Y API token, your organization ID and the format of your input files ("xml" or "ttl")
- Log in to the I14Y interoperability platform. Copy your token by clicking the profile symbol and paste it into config.py. Also provide the identifier of your organization there.
- Place your RDF files in the data/ folder (.xml, .rdf or .ttl)
- Run the import script:
python src/import_datasets.py
The script processes each dataset and displays real-time progress and error messages in the terminal.
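How the configured file format relates to the files in data/ can be sketched as below. This is a hedged illustration, not the logic of import_datasets.py: the mapping `EXTENSION_TO_FORMAT` and the function `find_rdf_files` are hypothetical, and the sketch assumes that both .xml and .rdf files count as the "xml" format. The demo runs against a temporary directory so that it does not depend on the repository layout.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from file extension to the format value in config.py.
# Assumption: .rdf files contain RDF/XML and therefore count as "xml".
EXTENSION_TO_FORMAT = {".xml": "xml", ".rdf": "xml", ".ttl": "ttl"}


def find_rdf_files(data_dir, file_format):
    """Return the sorted files in data_dir whose extension matches file_format."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file()
        and EXTENSION_TO_FORMAT.get(path.suffix.lower()) == file_format
    )


# Demo with a throwaway folder standing in for data/.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("datasets.xml", "extra.rdf", "other.ttl", "notes.txt"):
        (Path(tmp) / name).touch()
    xml_names = [p.name for p in find_rdf_files(tmp, "xml")]
    ttl_names = [p.name for p in find_rdf_files(tmp, "ttl")]

print(xml_names)  # → ['datasets.xml', 'extra.rdf']
print(ttl_names)  # → ['other.ttl']
```

Files with other extensions, such as notes.txt here, are ignored in this sketch.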
import_rdf_datasets/
├── data/
│ └── datasets.xml
├── src/
│ ├── config.py
│ ├── dcat_properies_utils.py
│ ├── import_datasets.py
│ └── mappings.py
├── requirements.txt
└── README.md
Please ensure any pull requests or contributions adhere to the following guidelines:
- Keep the code simple and well-documented
- Follow PEP 8 style guidelines
- Include appropriate error handling
- Test thoroughly before submitting