The following guidelines are designed to help determine the most effective way to structure DCAT distributions, whether as a single file, a multi-file package, or multiple distributions. The choice depends on the dataset's characteristics, user needs, and the data's intended use. Consider these guidelines to ensure your distributions are user-friendly, accessible, and aligned with best practices in data management.

Single-File Distribution: Ideal for datasets that are cohesive and standalone, typically encapsulated in a single format like CSV or XML. This approach is beneficial for smaller or comprehensive datasets, simplifying access and use. The key is to choose a file format that effectively represents all necessary data.

Multi-File Packaged Distribution: Essential for complex datasets, such as ArcGIS shapefiles, which require multiple interdependent files. Packaging related files together is useful for large or component-rich datasets. It's crucial to include all essential components and ensure the package facilitates easy download and usage.

Multiple Distributions in a Dataset: Suitable for datasets that can be logically segmented or offered in different formats. This method allows targeted access to specific data parts and enables selective updating. Clear documentation of each distribution is important for user navigation.

When selecting a distribution format, consider factors such as the interdependence of files, the ease of user accessibility, the size and downloadability of the data, the frequency of updates, and the diversity of formats required. A thoughtful approach to these criteria will help in creating a distribution strategy that is both practical for data providers and beneficial for end-users, enhancing the overall effectiveness of data sharing and utilization.
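To make the options concrete, the following Turtle sketch describes one dataset published both as a single-file CSV distribution and as a multi-file shapefile package. All URIs, titles, and file names are invented for the example, and the ZIP file-type URI is just one possible way to state the packaging format.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# One dataset, two access options: a standalone CSV and a zipped
# shapefile package (illustrative resources only).
<https://example.gov/dataset/hydrography> a dcat:Dataset ;
    dct:title "Hydrography (example)"@en ;
    dcat:distribution <https://example.gov/dist/hydrography-csv> ,
                      <https://example.gov/dist/hydrography-shp> .

# Single-file distribution.
<https://example.gov/dist/hydrography-csv> a dcat:Distribution ;
    dct:title "Single-file CSV"@en ;
    dcat:downloadURL <https://example.gov/files/hydrography.csv> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> .

# Multi-file packaged distribution: the interdependent shapefile
# components travel together in one ZIP archive.
<https://example.gov/dist/hydrography-shp> a dcat:Distribution ;
    dct:title "Shapefile package (ZIP)"@en ;
    dcat:downloadURL <https://example.gov/files/hydrography-shp.zip> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/application/zip> ;
    dcat:packageFormat <http://publications.europa.eu/resource/authority/file-type/ZIP> ;
    dct:format "Shapefile" .
```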
This section focuses on the properties central to the file-centric aspects of dcat:Distribution. These properties are crucial for ensuring datasets are accessible and usable in their practical forms, addressing the aspects of data encoding, structure, packaging, presentation, media type, and language.

Use dcat:mediaType, particularly when aligning with file formats recognized by central authorities. The role of dcterms:format is to offer a detailed description of the distribution's file format or physical medium. For instance, in the geospatial domain, this could include formats like "Shapefile" or comparable domain-specific formats.
To provide the checksum digest itself, use the spdx:checksumValue property. To indicate the algorithm used for generating the checksum, use the property spdx:algorithm with URIs defined in the SPDX specification, such as spdx:checksumAlgorithm_sha1, spdx:checksumAlgorithm_sha256, or spdx:checksumAlgorithm_sha512, depending on the algorithm employed.
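As a sketch of how these properties fit together, the snippet below attaches a SHA-256 checksum to a distribution through an spdx:Checksum node. The distribution URI is invented, and the digest shown happens to be the SHA-256 of an empty file, used here as a stand-in.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix spdx: <http://spdx.org/rdf/terms#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical distribution carrying a SHA-256 checksum.
<https://example.gov/dist/hydrography-csv> a dcat:Distribution ;
    dcat:downloadURL <https://example.gov/files/hydrography.csv> ;
    spdx:checksum [
        a spdx:Checksum ;
        spdx:algorithm spdx:checksumAlgorithm_sha256 ;
        # Placeholder digest (the SHA-256 of an empty file).
        spdx:checksumValue "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"^^xsd:hexBinary
    ] .
```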
| Code | Description | IANA name |
| --- | --- | --- |
| ucs2 | 16-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-2 |
| ucs4 | 32-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-4 |
| utf7 | 7-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-7 |
| utf8 | 8-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-8 |
| utf16 | 16-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-16 |
| 8859part1 | ISO/IEC 8859-1, Information technology - 8-bit single byte coded graphic character sets - Part 1: Latin alphabet No.1 | ISO-8859-1 |
| 8859part2 | ISO/IEC 8859-2, Information technology - 8-bit single byte coded graphic character sets - Part 2: Latin alphabet No.2 | ISO-8859-2 |
| 8859part3 | ISO/IEC 8859-3, Information technology - 8-bit single byte coded graphic character sets - Part 3: Latin alphabet No.3 | ISO-8859-3 |
| 8859part4 | ISO/IEC 8859-4, Information technology - 8-bit single byte coded graphic character sets - Part 4: Latin alphabet No.4 | ISO-8859-4 |
| 8859part5 | ISO/IEC 8859-5, Information technology - 8-bit single byte coded graphic character sets - Part 5: Latin/Cyrillic alphabet | ISO-8859-5 |
| 8859part6 | ISO/IEC 8859-6, Information technology - 8-bit single byte coded graphic character sets - Part 6: Latin/Arabic alphabet | ISO-8859-6 |
| 8859part7 | ISO/IEC 8859-7, Information technology - 8-bit single byte coded graphic character sets - Part 7: Latin/Greek alphabet | ISO-8859-7 |
| 8859part8 | ISO/IEC 8859-8, Information technology - 8-bit single byte coded graphic character sets - Part 8: Latin/Hebrew alphabet | ISO-8859-8 |
| 8859part9 | ISO/IEC 8859-9, Information technology - 8-bit single byte coded graphic character sets - Part 9: Latin alphabet No.5 | ISO-8859-9 |
| 8859part10 | ISO/IEC 8859-10, Information technology - 8-bit single byte coded graphic character sets - Part 10: Latin alphabet No.6 | ISO-8859-10 |
| 8859part11 | ISO/IEC 8859-11, Information technology - 8-bit single byte coded graphic character sets - Part 11: Latin/Thai alphabet | ISO-8859-11 |
| 8859part13 | ISO/IEC 8859-13, Information technology - 8-bit single byte coded graphic character sets - Part 13: Latin alphabet No.7 | ISO-8859-13 |
| 8859part14 | ISO/IEC 8859-14, Information technology - 8-bit single byte coded graphic character sets - Part 14: Latin alphabet No.8 (Celtic) | ISO-8859-14 |
| 8859part15 | ISO/IEC 8859-15, Information technology - 8-bit single byte coded graphic character sets - Part 15: Latin alphabet No.9 | ISO-8859-15 |
| 8859part16 | ISO/IEC 8859-16, Information technology - 8-bit single byte coded graphic character sets - Part 16: Latin alphabet No.10 | ISO-8859-16 |
| jis | Japanese code set used for electronic transmission | JIS_Encoding |
| shiftJIS | Japanese code set used on MS-DOS machines | Shift_JIS |
| eucJP | Japanese code set used on UNIX-based machines | EUC-JP |
| usAscii | United States ASCII code set (ISO 646 US) | US-ASCII |
| ebcdic | IBM mainframe code set | IBM037 |
| eucKR | Korean code set | EUC-KR |
| big5 | Traditional Chinese code set used in Taiwan, Hong Kong of China, and other areas | Big5 |
| GB2312 | Simplified Chinese code set | GB2312 |
The quality of a dataset plays a pivotal role in shaping trust, reusability, and the overall performance of applications that rely on it. As a result, it is imperative to integrate data quality information seamlessly into both the data publishing and consumption processes. This inclusion allows for a thorough evaluation of a dataset's quality, thereby determining its suitability for a particular application.

Thorough documentation of data quality significantly streamlines the dataset selection process, enhancing the likelihood of reuse. Regardless of domain-specific nuances, documenting data quality and explicitly stating known quality issues in metadata are fundamental practices. Typically, assessing quality involves multiple dimensions, each encapsulating characteristics of importance to both data publishers and consumers.

The Data Quality Vocabulary (DQV) defines machine-readable concepts such as measurements and criteria to assess quality across various dimensions [[VOCAB-DQV]]. Tailored heuristics designed for specific assessment scenarios rely on quality indicators, which encompass data content, metadata, and human ratings. These indicators offer valuable insights into the dataset's suitability for its intended purpose.

In the context of integrating data quality information into DCAT resources (Dataset, Distribution, Data Service, Dataset Series), the Data Quality Vocabulary [[VOCAB-DQV]] provides a structured and standardized way to represent and assess quality information for fitness for use. The key components of DQV relevant to this discussion are dqv:QualityMeasurement, dqv:Metric, dqv:Dimension, and the property dqv:hasQualityMeasurement. Here's how each of these elements is used:

- A dqv:QualityMeasurement instance is associated with a specific dataset and linked to the metric it measures.
- The dqv:Metric class represents the standard or criterion used to assess a particular aspect of quality. Metrics are the yardsticks against which quality is evaluated. Each metric is typically associated with a quality dimension. For example, a metric could measure the accuracy of data, its timeliness, or its completeness.
- The dqv:Dimension class represents the various dimensions or categories of data quality, such as accuracy, timeliness, or completeness. Quality dimensions help categorize different aspects of data quality, providing a framework for comprehensive assessment.
- The dqv:hasQualityMeasurement property is used to link a resource to a dqv:QualityMeasurement. It indicates that the dataset has been evaluated in terms of quality and specifies the measurement. This linkage is crucial for conveying the results of quality assessments to data consumers, enabling them to understand the quality aspects that have been measured and the outcomes of those measurements.
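The following Turtle sketch shows these four elements working together; the dataset, measurement, metric, and dimension URIs are all invented for the example.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# A dataset linked to one quality measurement.
<https://example.gov/dataset/hydrography> a dcat:Dataset ;
    dqv:hasQualityMeasurement <https://example.gov/quality/m1> .

# The measurement records the observed value for a given metric.
<https://example.gov/quality/m1> a dqv:QualityMeasurement ;
    dqv:isMeasurementOf <https://example.gov/quality/completenessRatio> ;
    dqv:value "0.98"^^xsd:decimal .

# The metric is the yardstick, categorized under a quality dimension.
<https://example.gov/quality/completenessRatio> a dqv:Metric ;
    skos:definition "Share of mandatory fields that are populated."@en ;
    dqv:inDimension <https://example.gov/quality/completeness> .

<https://example.gov/quality/completeness> a dqv:Dimension ;
    skos:prefLabel "Completeness"@en .
```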
Using these DQV elements, data publishers can document the quality of their datasets in a structured and meaningful way. This documentation includes specific measurements of quality, the criteria used for these assessments, and the quality dimensions they relate to. The use of DQV thus enhances transparency and helps data consumers make informed decisions about the suitability of a dataset for their specific needs.

The use of shareable controlled vocabularies for dqv:Metric and dqv:Dimension is highly encouraged within communities. These standardized vocabularies facilitate consistent and precise communication of data quality aspects across different datasets and applications. By adopting such vocabularies, communities can ensure that their data quality metrics and dimensions are universally understood, enhancing interoperability and the effective use of data across diverse systems and contexts.
Versioning is a concept used to describe the relationship between an original resource and its variations, updates, or translations. In this section, we explore how DCAT (Data Catalog Vocabulary) is employed to document versions resulting from updates or modifications throughout a resource's lifecycle.

DCAT relies on established vocabularies, including the versioning section of the PAV ontology and terms from [[PAV]], [[DCTERMS]], [[OWL2-OVERVIEW]], and [[VOCAB-ADMS]].

It's important to note that versioning applies to all primary DCAT-US resources, including Catalogs, Catalog Records, Datasets, Dataset Series, and Distributions.

The versioning approach within DCAT is designed to complement existing methods specific to certain resources (such as versioning properties for ontologies in [[OWL2-OVERVIEW]]) and those prevalent in particular domains. A detailed comparison with other vocabularies can be found in section 11.4: Complementary Approaches to Versioning.

Versioning is closely linked to community conventions, data management strategies, and existing processes. Data providers bear the responsibility of determining when and why a new version should be released.

Handling Dataset Changes

Datasets published on the Web are subject to change over time. Some datasets are updated on a regular schedule, while others evolve as improvements in data collection methods make updates necessary. To manage these changes effectively, new versions of a dataset may be created. However, there isn't a unanimous consensus on when changes to a dataset should categorize it as an entirely new dataset or simply a new version. Below, we outline scenarios where most publishers would agree that a revision warrants consideration as a new version:

Scenarios:

In general, when dealing with datasets that represent time series or spatial series, such as data for different regions or years, these are not typically regarded as multiple versions of the same dataset. Instead, each dataset covers a distinct set of observations about the world and should be treated as a new dataset. This principle also applies to datasets collecting data about weekly weather forecasts for a specific city, where a new dataset is created each week to store data for that particular week.

Scenarios 1 and 2 may trigger major version updates, while Scenario 3 is likely to trigger only a minor version update. However, the distinction between minor and major versions is less critical than ensuring that any changes are clearly indicated by incrementing the version number. Even for minor changes, maintaining a record of different dataset versions is essential for ensuring the dataset's reliability. Publishers should be mindful that a dataset may be in use by one or more data consumers, and they should take reasonable steps to inform those consumers when a new version is released. For real-time data, an automated timestamp can serve as a version identifier. It's crucial for publishers to adopt a consistent and informative approach to versioning for each dataset, ensuring that data consumers can effectively understand and work with evolving data.
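As a sketch of how a new release might be described, the snippet below uses dcat:version and dcat:previousVersion from [[VOCAB-DCAT]] together with adms:versionNotes from [[VOCAB-ADMS]]; the dataset URIs, dates, and notes are invented.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# A minor release that supersedes an earlier one.
<https://example.gov/dataset/budget/v1.1> a dcat:Dataset ;
    dct:title "Agency budget (v1.1)"@en ;
    dcat:version "1.1" ;
    adms:versionNotes "Corrected rounding errors in Q3 figures."@en ;
    dcat:previousVersion <https://example.gov/dataset/budget/v1.0> ;
    dct:issued "2024-03-01"^^xsd:date .
```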
Dataset Series

A Dataset Series is a collection of related datasets that share common characteristics, making them part of a cohesive group. This section provides guidance on the effective use of Dataset Series within data catalogs, emphasizing the benefits and considerations for publishers and users alike.

A Dataset Series is a way for publishers to convey that a dataset is evolving across specific dimensions and is available as a set of related datasets. However, choosing to group datasets this way depends on the use case. Since it demands extra metadata management from the publisher, it's optional. For instance, a dataset updated frequently via an API may not require individual records for each yearly snapshot unless the publisher wishes to share each snapshot's lifecycle.

Why Use Dataset Series?

Implementing Dataset Series offers several advantages:

Guidelines for Implementing Dataset Series

When using Dataset Series, consider the following best practices; among them, ensure that each dataset is significant independently and contributes to the series' overall narrative.

Expressing Relationships and Connections

Articulating the interconnections between datasets in a series is crucial for user understanding and data management (see the sketch at the end of this section).

Impact on Metadata

Being part of a Dataset Series may necessitate specific metadata considerations:
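The sketch below shows the series relationships defined in [[VOCAB-DCAT]] (dcat:inSeries to attach a member to its series, dcat:prev to point at the preceding member); the series and member URIs are invented.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# A yearly series with two member datasets.
<https://example.gov/series/air-quality> a dcat:DatasetSeries ;
    dct:title "Air quality, annual editions"@en .

<https://example.gov/dataset/air-quality-2023> a dcat:Dataset ;
    dct:title "Air quality 2023"@en ;
    dcat:inSeries <https://example.gov/series/air-quality> .

<https://example.gov/dataset/air-quality-2024> a dcat:Dataset ;
    dct:title "Air quality 2024"@en ;
    dcat:inSeries <https://example.gov/series/air-quality> ;
    # Ordering within the series.
    dcat:prev <https://example.gov/dataset/air-quality-2023> .
```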
Controlled Vocabularies

Importance of Controlled Vocabularies

Controlled vocabularies are predetermined sets of terms that have been carefully curated to ensure consistency, accuracy, and standardized representation of concepts within a specific domain. In the context of DCAT-US, controlled vocabularies are used to define and constrain the values of specific metadata elements. These vocabularies enable the creation of a common language for describing datasets, facilitating data integration and harmonization across different repositories.

The use of controlled vocabularies in DCAT-US offers several key benefits:

Requirements for controlled vocabularies

The following is a list of requirements that were identified for the controlled vocabularies to be recommended in this Application Profile. Controlled vocabularies SHOULD:

These criteria do not intend to define a set of requirements for controlled vocabularies in general; they are only intended to be used for the selection of the controlled vocabularies that are proposed for this Application Profile.

Controlled vocabularies to be used

In the table below, a number of properties are listed with controlled vocabularies that MUST be used for the listed properties. The declaration of the following controlled vocabularies as mandatory ensures a minimum level of interoperability.

Compared with [[DCAT-AP-20200608]], DCAT-US makes use of additional controlled vocabularies mandated by [[DATA-GOV-REG]] and operated by the Data.gov Registry, with the only exception of the coordinate reference systems register maintained by OGC [[OGC-EPSG]].

For two of these controlled vocabularies, namely the NGDA spatial data themes [[NGDA-THEMES]] and the ISO topic categories [[ISO-19115-1]], the DCAT-US Working Group has defined a set of harmonised mappings to the Data.gov Vocabularies Data Themes [[DATA-GOV-THEME]], in order to facilitate the identification of the relevant theme in [[DATA-GOV-THEME]] for geospatial/statistical metadata.

Other controlled vocabularies

In addition to the proposed common vocabularies, which are mandatory to ensure minimal interoperability, implementers are encouraged to publish and to use further region- or domain-specific vocabularies that are available online. While those may not be recognised by general implementations of the Application Profile, they may serve to increase interoperability across applications in the same region or domain. Examples are the full set of concepts in GCMD [[GCMD]], and numerous other schemes.

For geospatial metadata, the working group has identified the following additional vocabularies:

- Geographic identifiers:
  - For marine regions: Marine Regions (http://www.marineregions.org/) and the SeaVoX salt and fresh water body gazetteer (https://www.bodc.ac.uk/data/codes_and_formats/seavox/)
  - General: DBpedia for geographic placenames (http://dbpedia.org/about); national gazetteer vocabularies where feasible; the SeaVoX salt and fresh water body gazetteer for 'marine geonames' (https://www.bodc.ac.uk/data/codes_and_formats/seavox/)
- Keywords (with controlled vocabularies):
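A brief sketch of the intent: values constrained by controlled vocabularies are expressed as URIs from those vocabularies rather than free-text strings. Both identifiers below are illustrative placeholders, not actual registry entries.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<https://example.gov/dataset/coastal-survey> a dcat:Dataset ;
    # Theme taken from a (hypothetical) mandated theme vocabulary.
    dcat:theme <https://data.gov/themes/environment> ;
    # Spatial coverage identified by a gazetteer URI
    # (placeholder Marine Regions MRGID).
    dct:spatial <http://marineregions.org/mrgid/12345> .
```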
- For DCAT-US 3.0 conformance, it is not mandatory that this happens in a RDF serialisation, but the exchanged
- format SHOULD be unambiguously be transformable into RDF.
- For the format JSON, a popular format to exchange data between systems, DCAT-US profile provides a JSON-LD context
- file.
- JSON-LD is a W3C Recommendation [[[json-ld11]]] that provided a standard approach to interpret JSON structures
- as RDF. The provided JSON-LD context file can be used by implementers to base their data exchange upon, and so
- create
- a DCAT-US conformant data exchange. This JSON-LD context is not normative, i.e. other JSON-LD contexts are
- allowed
- to create a a conformant
- DCAT-US data exchange. The JSON-LD context file downloadable here. Dataset Series
+ Why Use Dataset Series?
+
-
-
- Requirements for controlled vocabularies
-
- Guidelines for Implementing Dataset Series
+
-
-
- Controlled vocabularies to be used
-
- Other controlled vocabularies
-
- Expressing Relationships and Connections
+
-
+
-
-
-
-
-
- Impact on Metadata
+
+
-
Controlled Vocabularies
+
+ Importance of Controlled Vocabularies
+
+
+
+ Requirements for controlled vocabularies
+
+
+
+
+ Controlled vocabularies to be used
+
+ Other controlled vocabularies
+
+
+
+
+
+
+
+
+ JSON-LD context file
-
-
One common technical question is the format in which the data is being exchanged. - For DCAT-US 3.0 conformance, it is not mandatory that this happens in a RDF serialisation, but the exchanged - format SHOULD be unambiguously be transformable into RDF.
-For JSON, which is a widely adopted format for data exchange between systems, the DCAT-US profile offers an - informative JSON Schema. This schema aids in understanding the structure expected for DCAT-US compliant data - exchanges in JSON format.
+ +One common technical question is the format in which the data is being exchanged. + For DCAT-US 3.0 conformance, it is not mandatory that this happens in a RDF serialisation, but the exchanged + format SHOULD be unambiguously be transformable into RDF. + For the format JSON, a popular format to exchange data between systems, DCAT-US profile provides a JSON-LD context + file. + JSON-LD is a W3C Recommendation [[[json-ld11]]] that provided a standard approach to interpret JSON structures + as RDF. The provided JSON-LD context file can be used by implementers to base their data exchange upon, and so + create + a DCAT-US conformant data exchange. This JSON-LD context is not normative, i.e. other JSON-LD contexts are + allowed + to create a a conformant + DCAT-US data exchange. The JSON-LD context file downloadable here.
-- JSON Schema offers a compact way to describe and validate the structure and content of JSON data, ensuring - specific formatting and value constraints. However, it's more limited than JSON-LD context and RDF serialization - due to its focus on structure over meaning. -
+- JSON Schema's focus on structural validation forms a contrast with JSON-LD and RDF's capabilities. JSON-LD and - RDF - go beyond just validation, allowing the creation of a graph of interconnected entities that can be easily - integrated and reused across various contexts. This interconnectedness is fundamental to the concept of the - semantic web, where data is not only readable but also comprehensible to machines. -
-- Specifically, JSON-LD facilitates the representation of data as a graph, making it suitable for more complex, - interlinked data representations, which is a cornerstone of linked data systems. This graph-based approach - stands - in contrast to the tree-like structures that JSON Schema is confined to, limiting its utility in scenarios - requiring extensive data interconnectivity and reusability. -
-- Implementers can use the provided JSON Schema for their data exchanges, aligning with DCAT-US standards. - However, - it's non-normative, meaning alternatives creating compliant exchanges are also valid. Download the current JSON - Schema here. -
+ +One common technical question is the format in which the data is being exchanged. + For DCAT-US 3.0 conformance, it is not mandatory that this happens in a RDF serialisation, but the exchanged + format SHOULD be unambiguously be transformable into RDF.
+For JSON, which is a widely adopted format for data exchange between systems, the DCAT-US profile offers an + informative JSON Schema. This schema aids in understanding the structure expected for DCAT-US compliant data + exchanges in JSON format.
- -+ JSON Schema offers a compact way to describe and validate the structure and content of JSON data, ensuring + specific formatting and value constraints. However, it's more limited than JSON-LD context and RDF serialization + due to its focus on structure over meaning. +
-+ JSON Schema's focus on structural validation forms a contrast with JSON-LD and RDF's capabilities. JSON-LD and + RDF + go beyond just validation, allowing the creation of a graph of interconnected entities that can be easily + integrated and reused across various contexts. This interconnectedness is fundamental to the concept of the + semantic web, where data is not only readable but also comprehensible to machines. +
++ Specifically, JSON-LD facilitates the representation of data as a graph, making it suitable for more complex, + interlinked data representations, which is a cornerstone of linked data systems. This graph-based approach + stands + in contrast to the tree-like structures that JSON Schema is confined to, limiting its utility in scenarios + requiring extensive data interconnectivity and reusability. +
++ Implementers can use the provided JSON Schema for their data exchanges, aligning with DCAT-US standards. + However, + it's non-normative, meaning alternatives creating compliant exchanges are also valid. Download the current JSON + Schema here. +
In order to verify whether a catalog adheres to the constraints stipulated in this Application Profile, the constraints are articulated using SHACL [[SHACL]]. All constraints in this specification that were amenable to translation into SHACL expressions have been incorporated. Consequently, this set of SHACL expressions can be employed to construct a validation check for data exchange between two systems, a common scenario being one catalog being harvested into another.

For example, it may be recognized that the data being exchanged doesn't include the organizations' details, since they are uniquely identified by a dereferenceable URI. In this scenario, enforcing rules about the mandatory presence of a name for each organization may not be pertinent. Rigorously applying the DCAT-AP SHACL expressions would trigger errors, even though the data is accessible via an alternative route. In this context, it's acceptable to omit this check during the validation phase.

This example underscores that to achieve an optimal user experience during a validation process, it's crucial to consider the actual data transferred between systems and apply only the constraints relevant to the data exchange. To facilitate this, the SHACL expressions are organized into separate files, aligning with common validation configurations.
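As an illustration of the kind of constraint discussed above, here is a hypothetical shape (not one of the actual DCAT-US shapes) requiring a name on every organization; a validation configuration that trusts dereferenceable organization URIs could simply leave this shape out of its shapes graph.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Illustrative shape: every foaf:Organization must carry at least
# one foaf:name.
<https://example.gov/shapes#OrganizationShape> a sh:NodeShape ;
    sh:targetClass foaf:Organization ;
    sh:property [
        sh:path foaf:name ;
        sh:minCount 1 ;
        sh:message "An organization should have at least one name."@en
    ] .
```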
The SHACL application profile for DCAT-US can be found here.
Namespaces and prefixes used in normative parts of this recommendation are shown in the following table:

| Prefix | Namespace IRI | Source |
| --- | --- | --- |
| adms | http://www.w3.org/ns/adms# | [[VOCAB-ADMS]] |
| cnt | http://www.w3.org/2011/content# | [[Content-in-RDF10]] |
| dcat | http://www.w3.org/ns/dcat# | [[VOCAB-DCAT]] |
| dcat-us | http://resources.data.gov/ontology/dcat-us# | [[DCAT-US]] |
| dct | http://purl.org/dc/terms/ | [[DCTERMS]] |
| dqv | http://www.w3.org/ns/dqv# | [[VOCAB-DQV]] |
| foaf | http://xmlns.com/foaf/0.1/ | [[FOAF]] |
| gsp | http://www.opengis.net/ont/geosparql# | [[GeoSPARQL]] |
| locn | http://www.w3.org/ns/locn# | [[LOCN]] |
| odrs | https://schema.theodi.org/odrs/ | [[ODRS]] |
| org | http://www.w3.org/ns/org# | [[VOCAB-ORG]] |
| prov | http://www.w3.org/ns/prov# | [[PROV]] |
| rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | [[RDF-SYNTAX-GRAMMAR]] |
| rdfs | http://www.w3.org/2000/01/rdf-schema# | [[RDF-SCHEMA]] |
| schema | http://schema.org/ | [[schema-org]] |
| sdmx-attribute | http://purl.org/linked-data/sdmx/2009/attribute# | [[SDMX-ATTRIBUTE]] |
| skos | http://www.w3.org/2004/02/skos/core# | [[SKOS-REFERENCE]] |
| spdx | http://spdx.org/rdf/terms# | [[SPDX]] |
| vcard | http://www.w3.org/2006/vcard/ns# | [[VCARD-RDF]] |
| xsd | http://www.w3.org/2001/XMLSchema# | [[XMLSCHEMA11-2]] |