From 530b3cbbe9a5ed7be87f9bdae57c034ed4bab6bb Mon Sep 17 00:00:00 2001
From: stephanef
+ Controlled vocabularies, including taxonomies and thesauri, dramatically enhance data searchability. Utilizing
+ these vocabularies allows datasets to be systematically classified, tagged, and described with standardized
+ terms, aiding users in retrieving relevant datasets, even when using varied terms or synonyms. Employing controlled vocabularies enables semantic search, which comprehends the context and
+ relationships behind search terms. This approach enhances search results, for example, linking "automobiles"
+ with related terms like "cars" or "vehicles". This enriched search experience is crucial for navigating vast, diverse datasets, ensuring comprehensive and
+ relevant results, and bridging the gap between user intent and dataset content. The DCAT-US profile utilizes properties from the DCAT 3 framework for resource classification, providing
+ flexibility in the choice of
+ controlled vocabularies to meet the specific needs of various communities or agencies.
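
The synonym-aware lookup described in the added text ("automobiles" also matching "cars" or "vehicles") can be sketched as follows. This is only an illustration, not part of the DCAT-US profile: the `THESAURUS` mapping, the `DATASETS` records, and the function names `expand_query` and `search` are all hypothetical, and a real deployment would more likely read labels from a SKOS concept scheme than from a hard-coded dictionary.

```python
# Hypothetical thesaurus: each preferred term maps to its alternative labels.
THESAURUS = {
    "vehicles": {"automobiles", "cars"},
}

# Hypothetical catalog records carrying free-text keywords.
DATASETS = [
    {"title": "Registered cars by state", "keywords": ["cars"]},
    {"title": "Crop yields", "keywords": ["agriculture"]},
]


def expand_query(term, thesaurus=THESAURUS):
    """Return the term plus every label that shares a concept with it."""
    term = term.lower()
    expanded = {term}
    for preferred, alternatives in thesaurus.items():
        labels = {preferred} | alternatives
        if term in labels:
            expanded |= labels
    return expanded


def search(term, datasets):
    """Return titles of datasets tagged with any label in the expanded query."""
    labels = expand_query(term)
    return [
        d["title"]
        for d in datasets
        if labels & {keyword.lower() for keyword in d["keywords"]}
    ]
```

Under these assumptions, `search("automobiles", DATASETS)` finds the dataset tagged only with "cars", which plain keyword matching would miss.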
+ dcterms:type: This property specifies the category or genre of
+ the content in a resource. It's applicable to dcat:Dataset, dcat:DataService, and dcat:DatasetSeries. For dcat:DataService, types might include "Web Map Service" (WMS) for
+ services providing geographical data in a map format, "Web Feature Service" (WFS) for services allowing
+ users to access geospatial features, or "RESTful API" for services using REST API protocols. For datasets,
+ types can be "Geospatial Dataset", "Image", "Statistical Dataset", or "Map". The Dublin Core Type Vocabulary
+ is a popular choice for providing standardized descriptors.
+
+ dcat:keyword: This property allows for the tagging of
+ datasets with relevant keywords, facilitating easier discovery and categorization. It is suitable for use
+ with dcat:Dataset, dcat:DataService, dcat:DatasetSeries, and dcat:Catalog. Employing keywords from established vocabularies such as
+ AGROVOC (for agricultural terms), Global Change Master Directory (GCMD) [[?GCMD]] (for Earth science), or NAICS (for industry classifications) ensures
+ consistency and enhances the discoverability of datasets within the US context.
+
+ dcat:theme: This property provides thematic
+ categorization for resources, specifically for dcat:Dataset and dcat:DatasetSeries. Utilizing a unified thematic taxonomy, such
+ as the Data Theme Taxonomy from Data.gov or the FGDC (Federal Geographic Data Committee) Controlled
+ Vocabularies like the ISO 19115 Topic CodeList, ensures a cohesive approach to categorizing datasets. This
+ thematic classification aids users in navigating and identifying datasets relevant to particular subjects or
+ sectors.
+
+ dcterms:subject: Aimed at providing detailed insight into
+ the primary subject matter of a dataset, this property is crucial for dcat:Dataset and dcat:DatasetSeries. Adoption of controlled vocabularies like
+ Global Change Master Directory (GCMD) [[?GCMD]] for Earth science topics, FAO AGROVOC for agricultural subjects, ITIS for taxonomic information, NAICS
+ for industry classifications, or LCSH (Library of Congress Subject Headings) enhances the clarity and
+ searchability of datasets, particularly in the context of US Government data. These vocabularies enable
+ precise and comprehensive subject classification, facilitating more effective data discovery and use.
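
As a rough illustration of how the four properties above might sit together on one resource, the sketch below builds a JSON-LD-style description with the Python standard library. The helper name `describe_dataset`, the dataset title, and the `example.org` theme and subject IRIs are assumptions made for this example; only the `dcat` and `dcterms` namespace IRIs and the DCMI Type IRI are real identifiers. Keywords are kept as plain literals, while type, theme, and subject point at vocabulary IRIs.

```python
import json


def describe_dataset(title, resource_type, keywords, themes, subjects):
    """Build a JSON-LD-style dict for a dcat:Dataset (illustrative helper)."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dcterms": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dcterms:title": title,
        # Category or genre of the resource, taken from a type vocabulary.
        "dcterms:type": {"@id": resource_type},
        # Keywords are literals rather than IRIs.
        "dcat:keyword": list(keywords),
        # Themes and subjects reference concepts in controlled vocabularies.
        "dcat:theme": [{"@id": iri} for iri in themes],
        "dcterms:subject": [{"@id": iri} for iri in subjects],
    }


record = describe_dataset(
    title="Sea surface temperature grids",  # hypothetical dataset
    resource_type="http://purl.org/dc/dcmitype/Dataset",
    keywords=["sea surface temperature", "oceans"],
    themes=["https://example.org/themes/climate"],  # placeholder IRI
    subjects=["https://example.org/subjects/ocean-temperature"],  # placeholder IRI
)
print(json.dumps(record, indent=2))
```

Swapping the placeholder IRIs for concepts from whichever vocabulary a community adopts (for instance GCMD or NAICS) leaves the structure unchanged, which is the flexibility the profile text describes.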
+ Controlled vocabularies, encompassing taxonomies and thesauri, have a transformative impact on data searchability.
- By using these vocabularies, datasets can be classified, tagged, and described with standardized terms and
- phrases. This standardization ensures that users searching with different terms or synonyms can still retrieve
- the most relevant datasets. More than just keyword matching, the use of controlled vocabularies facilitates semantic
- search. This means that the search process understands the context, relationships, and meanings
- behind terms, rather than just the terms themselves. For instance, when using a thesaurus-based vocabulary,
- searching for "automobiles" might also yield results for "cars" or "vehicles". Such an enriched search experience becomes especially vital when dealing with vast and diverse datasets. It
- ensures that users can find the most relevant and comprehensive results, even if the exact phrasing or
- terminology varies between the user's query and the dataset's metadata. In essence, controlled vocabularies
- bridge the gap between user intent and dataset content, leading to more accurate and meaningful search outcomes.
- The DCAT US profile uses a range of properties from the DCAT 3 framework to classify and categorize resources,
- helping users and systems understand and navigate resources.
- The
- Relevant for dcat:Dataset, dcat:DataService, dcat:Catalog, and dcat:DatasetSeries, the
-
- Applicable to dcat:Dataset and dcat:DatasetSeries, the
-
- Suitable for dcat:Dataset and dcat:DatasetSeries, the
- Concept
accessed, integrated with other resources, and reused across the DCAT-US ecosystem, promoting data
interoperability and accessibility.
Concept Scheme
using SKOS encoding and provided in Linked Data format (RDF/XML,TTL, JSON-LD, NTriples)
Extended Attributions and Diverse Roles
Resource Classification
+
+
- Resource types
- dcterms:type
property specifies the nature or genre of content and is applicable to
- dcat:Dataset, dcat:DataService, and dcat:DatasetSeries. For instance, types
- might include "Geospatial Dataset", "Image", "Statistical Dataset", or "Map". The Dublin Core Type Vocabulary
- is for example a popular vocabulary used to categorize datasets.
- Keywords
- dcat:keyword
property allows datasets to be tagged with pertinent terms represented as literals.
- Using keywords from AGROVOC, GCMD, or the North American Industry
- Classification System (NAICS) can enhance consistency in the US context.
- Thematic Classification
- dcat:theme
- property offers thematic categorization. The Data Theme Taxonomy from Data.gov (TBD) and the
- Federal Geographic Data Committee (FGDC) Controlled Vocabularies such as ISO 19115 Topic
- CodeList
- and
- Geoplatform NSDI Themes are widely used in the US to ensure a unified theming approach.
- Subject Classification
- dcterms:subject
- property provides deeper insight into a dataset's primary subject. Adopting vocabularies like the
- Global Change Master Directory (GCMD) FAO Agrovoc, the Integrated
- Taxonomic Information System (ITIS), the North American Industry Classification System
- (NAICS) or Library of Congress Subject Headings (LCSH), can optimize clarity and
- searchability in US Governement datasets.
- Spatial Metadata
@@ -21520,7 +21529,7 @@ Other controlled vocabularies
Profile, they
may serve to increase interoperability across applications in the same region or domain. Examples
are the full
- set of concepts in GCMD [[GCMD]],and numerous other schemes.
For geospatial metadata, the working group has identified the following additional vocabularies: