| title | description | image |
|---|---|---|
| Changelog: Open Data Contract Standard (ODCS) | Home of Open Data Contract Standard (ODCS) documentation. | |

This document tracks the history and evolution of the Open Data Contract Standard.
- Added field `authoritativeDefinitions` into JSON schema.
- Added field `description.customProperties` into JSON schema.
- Added field `description.authoritativeDefinitions` into JSON schema.
- Added field `role.customProperties` into JSON schema.
- Updated `status` field to include examples.
- Updated `authoritativeDefinitions` description to be vendor agnostic.
- Updated `tags` description and included examples.
- New section: Support & communication channels.
- New section: Servers.
- Changes to fundamentals:
  - Rename `uuid` to `id`.
  - Add `name`.
  - Rename `quantumName` to `dataProduct` and make it optional.
  - Rename `datasetDomain` to `domain` (we avoid the dataset prefix).
  - Drop `datasetKind` (example: `virtualDataset`, was optional, have not seen any usage).
  - Drop `userConsumptionMode` (examples: `analytical`, was optional, already deprecated in v2).
  - Drop `sourceSystem` (example: `bigQuery`, information will be encoded in servers).
  - Drop `sourcePlatform` (example: `googleCloudPlatform`, information will be encoded in servers).
  - Drop `productSlackChannel` (will move to support channels).
  - Drop `productFeedbackUrl` (will move to support channels).
  - Drop `productDl` (will move to support channels).
  - Drop `username` (credentials should not be stored in the data contract).
  - Drop `password` (credentials should not be stored in the data contract).
  - Drop `driverVersion` (will move to servers if needed).
  - Drop `driver` (will move to servers if needed).
  - Drop `server` (will move to servers if needed).
  - Drop `project` (BigQuery-specific, will move to servers).
  - Drop `datasetName` (BigQuery-specific, will move to servers).
  - Drop `database` (BigQuery-specific, will move to servers).
  - Drop `schedulerAppName` (not part of the contract).
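
The fundamentals renames and removals listed above are mechanical enough to script. The following is a minimal sketch, assuming the v2 contract has already been parsed into a Python dictionary (for example from YAML) and that these fields sit at its top level. The helper name `migrate_fundamentals` is hypothetical and not part of any ODCS tooling.

```python
# Hypothetical helper illustrating the v2 -> v3 fundamentals changes.
# Not part of ODCS or any official tooling.

# Fundamentals renamed in v3 (old key -> new key).
RENAMED = {
    "uuid": "id",
    "quantumName": "dataProduct",
    "datasetDomain": "domain",
}

# Fundamentals dropped in v3.
DROPPED = {
    "datasetKind", "userConsumptionMode", "sourceSystem", "sourcePlatform",
    "productSlackChannel", "productFeedbackUrl", "productDl",
    "username", "password", "driverVersion", "driver", "server",
    "project", "datasetName", "database", "schedulerAppName",
}


def migrate_fundamentals(contract: dict) -> dict:
    """Return a copy of the contract with v2 fundamentals renamed or dropped."""
    migrated = {}
    for key, value in contract.items():
        if key in DROPPED:
            continue  # assumption: the sketch discards the value outright
        migrated[RENAMED.get(key, key)] = value
    return migrated
```

This sketch simply discards dropped fields. Several of them are described as moving to servers or support channels, so a real migration would likely need to carry those values over by hand.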
- Changes to Schema:
  - Major changes, check spec.
  - Adds support for non table formats, hierarchies, and arrays.
  - `name` is a new field.
  - `items` is a new field.
  - `priorTableName` is not supported anymore, if needed, consider a custom property.
  - `table` is not supported anymore, if needed, consider using `name`.
  - `columns` is now `properties`.
  - `dataGranularity` is now `dataGranularityDescription`.
  - `encryptedColumnName` is now `encryptedName`.
  - `partitionStatus` is now `partitioned`.
  - `clusterStatus` is not supported anymore, if needed, consider a custom property.
  - `clusterKeyPosition` is not supported anymore, if needed, consider a custom property.
  - `sampleValues` is now `examples`.
  - `isNullable` is now `required`.
  - `isUnique` is now `unique`.
  - `isPrimaryKey` is now `primaryKey`.
  - `criticalDataElementStatus` is now `criticalDataElement`.
  - `transformSourceTables` is now `transformSourceObjects`.
  - Restrict `schema.*.logicalType` to be one of `string`, `date`, `number`, `integer`, `object`, `array`, `boolean`.
  - Add `schema.*.logicalTypeOptions`.
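
The restriction on `schema.*.logicalType` above amounts to a membership check. A minimal sketch, assuming a schema property has been parsed into a Python dictionary; the helper name `is_valid_logical_type` is hypothetical, and treating a missing `logicalType` as acceptable is an assumption rather than something the changelog states.

```python
# Allowed values for schema.*.logicalType, as listed in the v3 changes.
LOGICAL_TYPES = {"string", "date", "number", "integer", "object", "array", "boolean"}


def is_valid_logical_type(prop: dict) -> bool:
    """Return True if the property has an allowed logicalType.

    Assumption: a property without logicalType is treated as valid.
    """
    logical_type = prop.get("logicalType")
    return logical_type is None or logical_type in LOGICAL_TYPES
```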
- Changes to Data Quality:
  - Significant changes have been applied to support more tools and use cases. Please review the new section.
  - If needed, `templateName` is a custom property.
  - `toolName` is obsolete, replaced by `type=custom; engine: <engine name>`.
  - `scheduleCronExpression` is replaced by `schedule` and `scheduler`. `scheduleCronExpression: 0 20 * * *` becomes `schedule: 0 20 * * *` and `scheduler: cron`.
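
The `scheduleCronExpression` change above can be sketched as a small transformation. This assumes a quality rule has been parsed into a Python dictionary; the helper name `migrate_quality_schedule` is hypothetical and not part of any ODCS tooling.

```python
def migrate_quality_schedule(rule: dict) -> dict:
    """Return a copy of a quality rule using `schedule` and `scheduler`.

    The old cron expression moves to `schedule` unchanged, and
    `scheduler` is set to "cron", matching the example in the changelog.
    """
    migrated = dict(rule)
    if "scheduleCronExpression" in migrated:
        migrated["schedule"] = migrated.pop("scheduleCronExpression")
        migrated["scheduler"] = "cron"
    return migrated
```

Rules without `scheduleCronExpression` pass through unchanged.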
- Pricing:
  - No changes.
- Changes to team (fka stakeholders):
  - Replaces `stakeholders`. Content stays the same.
- Changes to Role:
  - Added `description`.
  - Changed `access`: it is not required anymore.
- Security:
  - No changes.
- Changes to SLA:
  - Starting with v3, the schema is not purely tables and columns, hence minor modifications: columns are now elements.
  - `slaDefaultColumn` is now `slaDefaultElement`.
  - `column` is now `element`.
  - Explicit reference to Data QoS.
- Changes to custom and other properties:
  - `systemInstance` is not supported anymore, if needed, consider a custom property.
- In JSON schema validation:
  - Change `dataset.description` data type from `array` to `string`.
  - Change `dataset.column.isPrimaryKey` data type from `string` to `boolean`.
  - Change `price.priceAmount` data type from `string` to `number`.
  - Change `slaProperties.value` data type from `string` to `oneOf[string, number]`.
  - Change `slaProperties.valueExt` data type from `string` to `oneOf[string, number]`.
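
The data type changes above can be illustrated with a plain type check. This is a minimal sketch, not a replacement for validating against the JSON schema itself. It assumes the caller supplies values keyed by the dotted field names used in the changelog, and the helper name `type_violations` is hypothetical.

```python
# Expected Python types per field after the JSON schema changes.
# Keys are the dotted names from the changelog, used here only as labels.
EXPECTED_TYPES = {
    "dataset.description": (str,),
    "dataset.column.isPrimaryKey": (bool,),
    "price.priceAmount": (int, float),
    "slaProperties.value": (str, int, float),
    "slaProperties.valueExt": (str, int, float),
}


def type_violations(values: dict) -> list:
    """Return the field names whose values do not match the expected type."""
    problems = []
    for field, value in values.items():
        allowed = EXPECTED_TYPES.get(field)
        if allowed is None:
            continue  # field not covered by this sketch
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and bool not in allowed:
            problems.append(field)
        elif not isinstance(value, allowed):
            problems.append(field)
    return problems
```

For example, a `price.priceAmount` given as the string `"9.95"` would be reported, while the number `9.95` would not.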
- Update examples to adhere to JSON schema.
- Full example from README directs to full-example.yaml.
- Add in mkdocs for creating a documentation website. Check building-doc.md.
- Add vendors page vendors.md. Feel free to add anyone there.
- Reformat quality examples to be valid YAML.
- Type of definition for authority has standard values: `businessDefinition`, `transformationImplementation`, `videoTutorial`, `tutorial`, and `implementation`.
- Add in `isUnique`, `primaryKeyPosition`, `partitionKeyPosition`, and `clusterKeyPosition` to `column` definition.
- Add JSON schema to validate YAML files for v2.2.1.
- Integrated as part of Bitol.
- Reformat Markdown tables.
- New name to Open Data Contract Standard.
- `templateName` is now called `standardVersion`. v2.2.0 parsers should account for this change and support both to avoid a breaking change.
- Added support for `authoritativeDefinitions` at the table level.
- Added many examples.
- Various improvements and typo corrections.
- Finalization of fork under AIDA User Group.
- Open source version.
- Additional value field `valueExt` in SLA.
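
The list above asks v2.2.0 parsers to accept both `templateName` and `standardVersion`. One way to do that is a simple fallback lookup. This is a minimal sketch, assuming the contract has been parsed into a Python dictionary and that the key sits at the level of the mapping passed in. The helper name `read_standard_version` is hypothetical.

```python
def read_standard_version(contract: dict):
    """Return the standard version, accepting the new and the old key.

    Prefers `standardVersion` and falls back to the pre-v2.2.0
    `templateName`. Returns None if neither key is present.
    """
    if "standardVersion" in contract:
        return contract["standardVersion"]
    return contract.get("templateName")
```

Preferring the new key means a contract that carries both is read according to the current naming.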
The data contract adds elements specifically for interfacing with the Data Quality tooling.
Additions:
- quality (table level & column level check):
- templateName (called standardVersion since v2.2.0)
- dimension
- type
- severity
- businessImpact
- scheduleCronExpression
- customProperties
- columns
- isPrimaryKey
The data contract is a logical construct; we add more specific links to the physical world.
The service-level agreements not previously used are more detailed to follow the DP QoS pattern. See SLA.
Removed the weight for system ratings from the data contract. Their default values remain.
- Type case
- Support for SemVer versioning.
- Tags can have values.
- Version of contract definition: v2.0.0. A breaking change with v1.
- Description:
- Purpose (text field).
- Limitations (text field).
- Usage (text field).
- Domain.
- Dictionary section:
- Identification of masked column (encryptedColumnName property), example: the email_decrypted column would be masked by email_encrypted.
- Flag for critical data element.
- Added keys for transformation data (sources, logic, description).
- Sample values.
- Ability to specify links to authoritative sources at the column level (authoritativeDefinitions).
- Business name.
- List of stakeholders:
- Username (user account).
- Role.
- Date in.
- Date out.
- Replaced by.
- Service levels: agreements & objective original inspiration.
- Price / cost.
- Name changes to match PPaaS type case.
- Product data:
- productDl.
- productSlackChannel.
- productFeedbackUrl.
- Renamed `tables` key to `dataset`.
- Removed `owner` key. Owner is now a stakeholder role.
- Additional quality keys:
- description.
- toolName.
- toolRuleName.
- Custom Properties.
- Product dates:
- generalAvailabilityDate.
- endOfSupportDate.
- endOfLifeDate.
- Description of the data quantum/data artifact.
- Roles.
- Schema:
- Tables, columns.
- Data quality.
- System rating weightage.
- Ratings:
- System, user, etc.