JRC requirements

original doc: https://ilvo.sharepoint.com/:w:/s/HESoilWiseProject/EZVzNSxFiyBDkCPUZxmYfhUBYptMqe_oJZtOhGY-FWva8g?e=aO2SHv


id category JRC - requirement technical component related project partner comment JRC response SoilWise iteration 1 iteration 2-3 out of scope or to discuss further
1 Sources A definition of the repository would be good: its purpose, its contents, …; I assume the repository includes metadata. Would it also include data? Leonidas Liakos: SWR will in principle not duplicate and store data and knowledge assets. We distinguish between data and knowledge from persistent sources and non-persistent assets. Exceptions could be made in specific cases, such as when the persistence of high-value data/knowledge assets could otherwise not be guaranteed (see 3.1 Storage strategy, at D7.2 Open Science and Data Management Plan). Also, based on D1.3 Repository architecture. v1 6. SoilWise Repository will provide data download in two modes, “as is” or in an “interoperable way”. “As is” means the SWR is a broker connecting a user to a relevant data source. The “interoperable way” means the SWR is a mediator that converts external data into, e.g., INSPIRE-compliant data through an ETL-like tool, e.g. HALE Studio.
2 Sources SoilWise repository should be able to provide search capabilities for available results from R&I projects, especially those with funding from the Mission ‘A Soil Deal for Europe’, thus ensuring the legacy of the Mission. (would that be a search on the contents of the repository only)  Harvester - triple store - PyCSW Leonidas Liakos: For ‘A Soil Deal for Europe’, in 6.2.2 of Usage Scenarios. Requirements, v1 D1.1 it is not very clear whether the results of ‘A Soil Deal for Europe’ will be included in data search. It is referred to under EUSO: “While benefiting from the outcomes of the Mission ‘A Soil Deal for Europe’, EUSO will ensure the legacy of knowledge and data produced by the Mission. In this context, SoilWise primarily focuses on enhancing the technical infrastructure supporting EUSO’s activities, and will develop and test a prototype for an open access, user friendly long term knowledge and data repository, taking due account of the requirements emerging from the evolvement of the EUSO. After the project lifecycle, SoilWise Repository will be integrated and further developed and maintained by EUSO”
3 Sources SoilWise will develop a data catalogue for EUSO to host/search/update the data of Soil Mission projects plus other data resources in EUSO (would this be the repository?)  PyCSW
4 Sources Those results exist on source systems, like Cordis or ZENODO and other persistent repositories that are accepted by the EU (e.g. ESDAC). Harvester
5 Sources SoilWise repository should be able to connect with the source system and, if accessible, make requests on its available resources.   Harvester - data export/download
6 Sources SoilWise repository needs to be able to search and regularly harvest results from source systems. The results are categorised as data, metadata and knowledge from Soil Mission projects and others relevant to soil health R&I projects.  Harvester - triple store
7 Sources SoilWise repository needs to deal with results from persistent sources.   Harvester - live link checker
8 Sources For targeted R&I projects, there is a need for minimum data management requirements (e.g. data license, use of standards and locatable formats) to enable the injection of the results by SoilWise and the further use of the data results by JRC.   Governance
9 Sources There is a need for engagement with important R&I projects in order to understand what is being created, what is coming, when it is available and usable, and who will use the data, together with a stakeholder analysis. Identification of the projects’ Data Management Plans and workflows is foreseen. It would be useful if EUSO together with SoilWise could informally collaborate with some EU projects in order to shape a DMP that is useful for SoilWise and EUSO   Governance
10 Sources SoilWise needs to deal with the fact that “Projects need to make data available to EUSO, not to SoilWise”: Perhaps a required open license in the upcoming GA contracts is an option to mitigate the problem (e.g. CC BY 4.0 or MIT). Projects would need to make the data publicly available on persistent storage (e.g. ZENODO), accompanied by metadata, which also specifies access and use constraints  Governance
11 Data/metadata JRC considers EU-projects data to be raw data (GeoTIFF, raster, polygon), maps, and points describing geo-physical properties related to soils. JRC considers information to be statistics, methodologies, models, syntheses, reports and conclusions.  Harvester - triple store
12 Data/metadata JRC expects a visualisation component, cataloguing and searching in SoilWise. Alignment of efforts with the EUSO Dashboard developments is an intention of both JRC and SoilWise.  PyCSW Leonidas Liakos: It is described in the functionalities that: “The SWR Catalogue will display a map preview of a resource (dataset/knowledge/service/…) from the source graphic/WMS/…, if applicable.” Marc Van Liedekerke: last sentence about alignment is out of place. Leonidas Liakos: SoilWise is mainly a metadata repository of existing data sources. How will SoilWise present and visualise unharmonised and non-comparable data across the EU? Who is responsible for processing and harmonising the individual datasets?
13 Data/metadata Soil datasets are considered to be high-value datasets (HVD), but less than 50% of (national) soil datasets in Europe are currently (2023) accessible and there does not seem to be harmonisation across the EU. Soil data model harmonisation is challenging because there is no legal obligation except INSPIRE, and data sharing culture is limited in the soil domain. However, SoilWise considers the foreseen legal obligations (such as INSPIRE) and follows the development of the EU Soil Health Law and its implementation.   Governance - HALE Marc Van Liedekerke: "Soil data model harmonisation is challenging because there is no legal obligation except INSPIRE" Do not understand the reasoning
14 Data/metadata SoilWise notes that no high-frequency data is expected. However, a high volume of data (rasters of the order of GB) is expected. It may also be that versions or updates of existing datasets are foreseen. It’s essential to deal with publishing permissions.  PostgreSQL Leonidas Liakos: The capacity of the SWR will be significantly limited (as described in the list of modular functionalities). Does that mean that JRC is responsible for data storage? Marc Van Liedekerke: Should be indicated in the metadata. Leonidas Liakos: Authorisation is described in the list of functionalities from user story epics but there are no further details about the rights of users. In the Technical Documentation there are limited details about authorization on CRUD operations and not on publishing privileges
15 Data/metadata The Repository will link with national data portals but will not store raw national data.  Harvester
16 Data/metadata Data should be stored in a (guaranteed) accessible permanent data storage: - Exchange/interoperable format. - Accompanied by metadata - template from JRC/SoilWise.  Governance - data export/download - HALE Studio - metadata validation
17 Storage SoilWise aims to show metadata of all relevant data sources.  Harvester - Metadata validation - PyCSW
18 Storage Not all harvested data need to be stored in the repository.  PostgreSQL - Harvester Leonidas Liakos: This is also a question about the limits and the capacity of the SWR. It is described as a metadata catalogue but also with capabilities to store/upload data (Table 6 Overview of technical components and their functionalities extracted from user story epic: see Cloud storage component, Store and retrieve data functionalities). But I also see in the description of the Functionality for first version of the Repository prototype that: SoilWise Repository will enable user to manually upload data for their on-the-fly processing within the SWR. Capacity of the SWR will be significantly limited; however, a demonstration of manual data upload and their processing, e.g. transformation of coordinate systems or measurements units. What is the meaning of this functionality, especially given the limited capacity? Will it store data or will it be used as a data processing interface? Leonidas Liakos: SWR will keep metadata that point to source datasets and it will check whether these sources are alive. Marc Van Liedekerke: Is the repository really going to store data?
19 Storage Labelling data results as high-value data sets will rely on EUSO/ESDAC criteria, which are not identical to the HVD directive criteria.  More specifically, datasets from sources should be in Zenodo or similar; if these are High-quality and excellent datasets can be promoted by EUSO. For critical / high-value resources, SoilWise will either link with them or store them.  Harvester - PostgreSQL Panos Ilias: related to the high-value datasets: JRC or not? Reply by Panos Panagos: EUSO Marc Van Liedekerke: related to HVD directive criteria - Which are these? Marc Van Liedekerke: related to "if these are High-quality and excellent datasets can be promoted by EUSO. For critical / high-value resources, SoilWise will either link with them or store them." - Don't understand
20 Storage Non high value resources will, in principle, not be stored by SoilWise but under conditions in commonly accepted and suitable repositories like Zenodo or similar.  PostgreSQL - Harvester
21 Storage The SoilWise repository will not host model trains, AI pipelines etc. It will only store end products.  Harvester - Triple Store Fenny van Egmond: Why not? Can link to git sources where these are stored perhaps? Leonidas Liakos: Why not? They are part of the reproducibility of the research.
22 Discovery and publishing services EUSO/JRC aims for a single access point of soil related data at EU level to ensure persistence in results.  Governance Leonidas Liakos: It is described in 4.3.3 Use case 3: policy makers in Usage Scenarios Requirements
23 Discovery and publishing services SoilWise must provide catalogue services, enabling the end users to discover available results from R&I projects.  PyCSW
24 Discovery and publishing services The catalogue services need to support the preview of the data.  PyCSW - map server Marc Van Liedekerke: I would not be so sure of that. Leonidas Liakos: It is described in Vision scenarios as Functionality in the Repository architecture document. In practical terms, how is the preview going to be implemented for an EU-wide high-resolution multitemporal dataset? Leonidas Liakos: Does this mean any preprocessing of the data? And if yes, who will be responsible for this?
25 Discovery and publishing services The ability for spatial queries is nice to have; for example, select a country or a region.  map server Leonidas Liakos: It was discussed in the “Kick-off architecture” of the Architecture working group https://miro.com/app/board/uXjVNfSujXA=/
26 Discovery and publishing services Thematic queries should also be available (e.g. erosion, carbon, etc.)  PyCSW - Triple Store
27 Discovery and publishing services Using web mapping services to deliver data products is not preferred due to maintainability issues.  map server Panos Panagos: To be discussed as option. Leonidas Liakos: WMS are included in SWR Architecture as part of the Data preview and Publication API.
28 Discovery and publishing services Expose the available data or knowledge through API standards (conforms to the FAIR Interoperability principle)  PyCSW Leonidas Liakos: In the “Kick-off architecture” there is a component (Data publication) that makes provision for OGC API/STAC/WCS. APIs also described in the Technical Documentation: https://soilwise-documentation.pages.dev/apis/apis-intro/
29 Discovery and publishing services It’s essential to deal with publishing permissions.  Governance - Harvester Marc Van Liedekerke: Should be indicated in the metadata. Leonidas Liakos: Authorisation is described in the list of functionalities from user story epics but there are no further details about the rights of users. In the Technical Documentation there are limited details about authorization on CRUD operations and not on publishing privileges
30 EUSO, ESDAC & SoilWise EUSO is a project that is more than a data infrastructure; it is also a community, a policy evaluation and communication tool, a monitoring framework that includes LUCAS Soil, etc. SoilWise must contribute to EUSO elements and support the EUSO dashboard, a communication tool that includes indicators and requires the provision of data flows from results also coming from R&I projects like SOMIMO AA. ESDAC now delivers the needed data flows, and SoilWise needs to contribute to this, considering the EUSO – ESDAC interlinkage. SoilWise should roll out the success of ESDAC or compete with ESDAC.  Governance Marc Van Liedekerke: related to "SoilWise should roll out the success of ESDAC or compete with ESDAC." - ??? Leonidas Liakos: reply: I think ESDAC will benefit from SoilWise because it will embed it in a wider “family” of Soil Data and create a virtual mind map of Soil data interconnections. At least this is the main idea I think
31 EUSO, ESDAC & SoilWise SoilWise needs to support EUSO, bringing together disparate datasets.  Harvester - Triple Store Leonidas Liakos: Is this implemented by Triple Store? In the functionalities of the SoilWise core product: The Triple store will be extended with links to knowledge assets from external resources, for the 1st iteration extracted from CORDIS. To a user, the knowledge will be made available, among others, based on interlinked metadata, i.e., metadata linked to relevant projects and project deliverables.
32 EUSO, ESDAC & SoilWise SoilWise should develop a functional metadata catalogue that serves as a prototype of the EUSO ESDAC repository and makes the data from Mission projects plus soil monitoring programmes findable.  PyCSW
33 EUSO, ESDAC & SoilWise An adaptive/agile approach is needed, aiming at simplicity. The products or the set of services need to be able to cover current needs. However, the architecture paradigm and the technological choices should also support expandability.  Marc Van Liedekerke: "adaptive/agile approach" - meaning? Leonidas Liakos: It is already defined that “The development of the SoilWise Repository will use Agile methodology, that means an iterative approach driven by user needs” (see at 7. Validation Framework in D.1.1) Leonidas Liakos: According to the description of the repository architecture, it will follow the agile approach and support expandability: the architecture design will be continuously updated reflecting (1) new or updated requirements from the use cases stakeholder groups, (2) new details discovered during development, and (3) novel scientific and technology advances, all according to the product backlog and SoilWise Repository Rolling plan. Leonidas Liakos: That’s the meaning of the “Rolling Plan” (a methodology to take any changes during development into account)
34 EUSO, ESDAC & SoilWise There is a need for early development and deployment that can be used immediately with the projects that are now running and starting, to secure the legacy of the Soil Mission. If this is not possible because the SoilWise project has not started yet, SoilWise is asked to suggest a temporary solution to support the achievement of this objective, like Zenodo or similar.  Panos Panagos: However, an early development of a first version of the catalogue may take place in mid-2024.
35 EUSO, ESDAC & SoilWise Consider that a Policy dashboard will also be created.  
36 EUSO, ESDAC & SoilWise JRC Science Hub can be an entry point to EUSO that offers more possibilities and less strict rules.  PyCSW
37 EUSO, ESDAC & SoilWise Similar to ESDAC, accessibility policy, usage and access constraints might not be needed.  Authentication - Authorization Marc Van Liedekerke: meaning? Leonidas Liakos: In other points it is recommended to impose an authentication and authorization policy…
38 EUSO, ESDAC & SoilWise EUSO uses smart visualisation to deliver information; however, it is desirable that SoilWise also supports inputs for information visualisation or feeds graphics with relevant information (feeding QLIK as an option).  PyCSW
39 EUSO, ESDAC & SoilWise SoilWise should consider that the main target group of EUSO consists of the policymakers. Policymakers want to know:  - Where is the problem (this needs data)? - What can we do about it? (needs knowledge) - Is what we are doing making a difference? (needs monitoring and data) - Are things getting better? (needs monitoring) - What are the consequences of the (proposed or implemented) action, e.g., trade-offs and costs?  Governance
40 Knowledge Knowledge management can support policymaking.  Governance
41 Knowledge SoilWise should consider that knowledge is an important element of EUSO because it is needed to deliver solutions (What). SoilWise should consider that indexing relevant keywords, terms, and topics will support the knowledge element of EUSO. SoilWise should consider that knowledge is identifiable or exists in R&I projects, in project website content and in project deliverables in document format, providing practical guidance and methodologies.  Harvester - Triple Store
42 Knowledge Knowledge = Tiers:  - Tier 1: Pan-European data sets (SoilWise primary target). - Tier 2: (derived) national/regional data sets - Tier 3: e.g. point data where we have explicit permission to publish.  Harvester - Triple Store Panos Panagos, related to Tier 1 & 3: Knowledge is not only data. Knowledge can be produced by legacy reports, deliverables, websites.
43 Knowledge Distinguish between JRC-in-house data/knowledge and beyond.  Harvester
44 Knowledge There might be IPR/GDPR issues.  Harvester Leonidas Liakos: For each data asset catalogued in the SWR, the metadata shall indicate which category (categories) of sensitive data this asset belongs to, if any, with a short rationale which parts of it fall under which specific category. Available information on licensing and IPR will be harvested where applicable and possible and will be integrated into the metadata. (See 2.6 Sensitive data and knowledge, at D7.2-Open Science and Data Management Plan)
45 Knowledge Knowledge (generally in the form of written material) should be copyright/GDPR-free.  - Knowledge repository (e.g. EUSO, CORDIS, ZENODO?) - Metadata describing the context of the writing.  Harvester - metadata validation
46 AI/ML An added layer on top, like e.g. ChatGPT, is more a long-term possibility.  Large Language Model
47 AI/ML AI may provide a different perspective on the data through visualisation.  metadata validation
48 Critical measures of success Knowledge Management + Data Repo (potent combination).  Triple Store - PostgreSQL
49 Critical measures of success Search and find data.  PyCSW
50 Critical measures of success Integration with knowledge from other projects.  Harvester - Triple Store
51 Critical measures of success Compatibility with existing infrastructure, also considering operational costs and other non-functional requirements like scalability, performance etc.   Leonidas Liakos: can BDAP host the services of SWR? More about SWR Hardware Infrastructure here: https://main.soilwise-documentation.pages.dev/infrastructure/infrastructure-intro/
52 Critical measures of success Visualization. 
53 Critical measures of success Communication. 
54 Critical measures of success Integration with Soil Mission Platform and ESDAC datasets
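
The storage-related rows above (requirements 1, 15 and 18 to 20) can be read as a simple decision rule: metadata is always catalogued, resources from persistent sources are only linked, and a copy is kept only for high-value resources whose persistence cannot otherwise be guaranteed. The sketch below is one possible reading of that rule, not an agreed SoilWise design; the class, field and function names are hypothetical and do not correspond to any SoilWise metadata element or component.

```python
from dataclasses import dataclass
from enum import Enum


class StorageAction(Enum):
    """Possible outcomes for a harvested resource (labels are illustrative)."""
    LINK_ONLY = "catalogue metadata and link to the source"
    STORE_COPY = "catalogue metadata and store a copy"
    DEPOSIT_EXTERNALLY = "ask the provider to deposit in e.g. Zenodo, then link"


@dataclass
class Resource:
    # Hypothetical fields, assumed for illustration only.
    title: str
    high_value: bool         # labelled high-value by EUSO/ESDAC criteria (req. 19)
    persistent_source: bool  # already held in a persistent repository (req. 4, 7)


def storage_action(resource: Resource) -> StorageAction:
    """Apply one reading of the storage requirements to a single resource."""
    if resource.persistent_source:
        # Persistent sources are linked, not duplicated (req. 1, 15).
        return StorageAction.LINK_ONLY
    if resource.high_value:
        # Exception: persistence could otherwise not be guaranteed (req. 1, 19).
        return StorageAction.STORE_COPY
    # Non-high-value resources go to a commonly accepted repository (req. 20).
    return StorageAction.DEPOSIT_EXTERNALLY
```

Whether a high-value resource on a persistent source should also be copied is left open by requirement 19 ("either link with them or store them"); the sketch assumes linking is enough in that case.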