Merge remote-tracking branch 'origin/main' into feature/add_union_types
syou6162 committed May 22, 2024
2 parents 3513aa3 + 8a1feeb commit 7a71273
Showing 34 changed files with 33,904 additions and 7,820 deletions.
32 changes: 31 additions & 1 deletion README.md
@@ -6,11 +6,25 @@
<img src="https://img.shields.io/pypi/pyversions/dbt-artifacts-parser.svg?color=%2334D058" alt="Supported Python versions">
</a>


# dbt-artifacts-parser

This is a dbt artifacts parser in Python.
It lets you work with `catalog.json`, `manifest.json`, `run-results.json` and `sources.json` as Python objects.

## Supported Versions and Compatibility

> **⚠️ Important Note:**
>
> - **Pydantic v1 will not be supported for dbt 1.9 or later.**
> - **To parse dbt 1.9 or later, please migrate your code to pydantic v2.**
> - **We will reassess version compatibility upon the release of pydantic v3.**

| Version | Supported dbt Version | Supported pydantic Version |
|---------|-----------------------|----------------------------|
| 0.7 | dbt 1.5 to 1.8 | pydantic v2 |
| 0.6 | dbt 1.5 to 1.8 | pydantic v1 |
| 0.5 | dbt 1.5 to 1.7 | pydantic v1 |

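As a rough illustration of how the table above can be read, the sketch below encodes it in a small lookup. The helper `is_supported` and the `COMPATIBILITY` mapping are hypothetical names used only for this example; they are not part of dbt-artifacts-parser, and the data is limited to the three rows listed in the table.

```python
from typing import Dict, Tuple

# Hypothetical lookup, not part of dbt-artifacts-parser.
# parser release -> (oldest dbt, newest dbt, pydantic major version),
# copied from the compatibility table above.
COMPATIBILITY: Dict[str, Tuple[str, str, int]] = {
    "0.7": ("1.5", "1.8", 2),
    "0.6": ("1.5", "1.8", 1),
    "0.5": ("1.5", "1.7", 1),
}


def _major_minor(version: str) -> Tuple[int, ...]:
    """Turn '1.8' or '1.8.3' into a comparable (major, minor) tuple."""
    return tuple(int(part) for part in version.split(".")[:2])


def is_supported(parser_release: str, dbt_version: str, pydantic_version: str) -> bool:
    """Return True when the combination appears in the table above."""
    if parser_release not in COMPATIBILITY:
        return False
    oldest, newest, pydantic_major = COMPATIBILITY[parser_release]
    dbt_ok = _major_minor(oldest) <= _major_minor(dbt_version) <= _major_minor(newest)
    pydantic_ok = int(pydantic_version.split(".")[0]) == pydantic_major
    return dbt_ok and pydantic_ok


print(is_supported("0.7", "1.8.0", "2.7.1"))  # release 0.7 with pydantic v2
print(is_supported("0.6", "1.8.0", "2.7.1"))  # release 0.6 expects pydantic v1
```

A lookup like this only mirrors the documented combinations; it does not inspect the installed packages.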
## Installation

```bash
@@ -35,13 +49,15 @@ Those are the classes to parse dbt artifacts.
- [ManifestV9](dbt_artifacts_parser/parsers/manifest/manifest_v9.py) for manifest.json v9
- [ManifestV10](dbt_artifacts_parser/parsers/manifest/manifest_v10.py) for manifest.json v10
- [ManifestV11](dbt_artifacts_parser/parsers/manifest/manifest_v11.py) for manifest.json v11
- [ManifestV12](dbt_artifacts_parser/parsers/manifest/manifest_v12.py) for manifest.json v12

### Run Results
- [RunResultsV1](dbt_artifacts_parser/parsers/run_results/run_results_v1.py) for run_results.json v1
- [RunResultsV2](dbt_artifacts_parser/parsers/run_results/run_results_v2.py) for run_results.json v2
- [RunResultsV3](dbt_artifacts_parser/parsers/run_results/run_results_v3.py) for run_results.json v3
- [RunResultsV4](dbt_artifacts_parser/parsers/run_results/run_results_v4.py) for run_results.json v4
- [RunResultsV5](dbt_artifacts_parser/parsers/run_results/run_results_v5.py) for run_results.json v5
- [RunResultsV6](dbt_artifacts_parser/parsers/run_results/run_results_v6.py) for run_results.json v6

### Sources
- [SourcesV1](dbt_artifacts_parser/parsers/sources/sources_v1.py) for sources.json v1
@@ -157,6 +173,13 @@ from dbt_artifacts_parser.parser import parse_manifest_v11
with open("path/to/manifest.json", "r") as fp:
    manifest_dict = json.load(fp)
    manifest_obj = parse_manifest_v11(manifest=manifest_dict)

# parse manifest.json v12
from dbt_artifacts_parser.parser import parse_manifest_v12

with open("path/to/manifest.json", "r") as fp:
    manifest_dict = json.load(fp)
    manifest_obj = parse_manifest_v12(manifest=manifest_dict)
```

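Each version-specific function above expects an artifact of the matching schema version, which dbt records in `metadata.dbt_schema_version`. The sketch below shows one way to read that value before choosing a parser. The helper name `detect_artifact_version` and the regular expression are assumptions made for illustration (the URL layout is modelled on the catalog schema URL that appears in this commit); this is not the library's own implementation.

```python
import re
from typing import Tuple

# Assumed URL layout: https://schemas.getdbt.com/dbt/<artifact>/v<N>.json
_SCHEMA_URL = re.compile(
    r"^https://schemas\.getdbt\.com/dbt/(?P<name>[a-z-]+)/v(?P<version>\d+)\.json$"
)


def detect_artifact_version(artifact: dict) -> Tuple[str, int]:
    """Return (artifact name, schema version) read from metadata.dbt_schema_version."""
    schema_url = artifact.get("metadata", {}).get("dbt_schema_version", "")
    match = _SCHEMA_URL.match(schema_url)
    if match is None:
        raise ValueError(f"Unrecognised dbt_schema_version: {schema_url!r}")
    return match.group("name"), int(match.group("version"))


# Minimal stand-in for a loaded manifest.json; a real file has many more keys.
manifest_dict = {
    "metadata": {
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v12.json"
    }
}
print(detect_artifact_version(manifest_dict))
```

In practice the generic `parse_manifest` function in `parser.py` already performs this kind of dispatch, so a helper like this is mainly useful for logging or for rejecting unsupported versions early.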
### Parse run-results.json
@@ -205,6 +228,13 @@ from dbt_artifacts_parser.parser import parse_run_results_v5
with open("path/to/run-results.json", "r") as fp:
    run_results_dict = json.load(fp)
    run_results_obj = parse_run_results_v5(run_results=run_results_dict)

# parse run-results.json v6
from dbt_artifacts_parser.parser import parse_run_results_v6

with open("path/to/run-results.json", "r") as fp:
    run_results_dict = json.load(fp)
    run_results_obj = parse_run_results_v6(run_results=run_results_dict)
```

### Parse sources.json
2 changes: 1 addition & 1 deletion dbt_artifacts_parser/__init__.py
@@ -17,4 +17,4 @@
 """
 A dbt artifacts parser in python
 """
-__version__ = "0.5.1"
+__version__ = "0.6.0"
39 changes: 34 additions & 5 deletions dbt_artifacts_parser/parser.py
@@ -28,11 +28,13 @@
 from dbt_artifacts_parser.parsers.manifest.manifest_v9 import ManifestV9
 from dbt_artifacts_parser.parsers.manifest.manifest_v10 import ManifestV10
 from dbt_artifacts_parser.parsers.manifest.manifest_v11 import ManifestV11
+from dbt_artifacts_parser.parsers.manifest.manifest_v12 import ManifestV12
 from dbt_artifacts_parser.parsers.run_results.run_results_v1 import RunResultsV1
 from dbt_artifacts_parser.parsers.run_results.run_results_v2 import RunResultsV2
 from dbt_artifacts_parser.parsers.run_results.run_results_v3 import RunResultsV3
 from dbt_artifacts_parser.parsers.run_results.run_results_v4 import RunResultsV4
 from dbt_artifacts_parser.parsers.run_results.run_results_v5 import RunResultsV5
+from dbt_artifacts_parser.parsers.run_results.run_results_v6 import RunResultsV6
 from dbt_artifacts_parser.parsers.sources.sources_v1 import SourcesV1
 from dbt_artifacts_parser.parsers.sources.sources_v2 import SourcesV2
 from dbt_artifacts_parser.parsers.sources.sources_v3 import SourcesV3
@@ -83,6 +85,7 @@ def parse_catalog_v1(catalog: dict) -> CatalogV1:
     ManifestV9,
     ManifestV10,
     ManifestV11,
+    ManifestV12,
 ]


@@ -118,6 +121,8 @@ def parse_manifest(manifest: dict) -> Manifest:
         return ManifestV10(**manifest)
     elif dbt_schema_version == ArtifactTypes.MANIFEST_V11.value.dbt_schema_version:
         return ManifestV11(**manifest)
+    elif dbt_schema_version == ArtifactTypes.MANIFEST_V12.value.dbt_schema_version:
+        return ManifestV12(**manifest)
     raise ValueError("Not a manifest.json")


@@ -169,52 +174,67 @@ def parse_manifest_v6(manifest: dict) -> ManifestV6:
     raise ValueError("Not a manifest.json v6")


-def parse_manifest_v7(manifest: dict) -> ManifestV6:
+def parse_manifest_v7(manifest: dict) -> ManifestV7:
     """Parse manifest.json ver.7"""
     dbt_schema_version = get_dbt_schema_version(artifact_json=manifest)
     if dbt_schema_version == ArtifactTypes.MANIFEST_V7.value.dbt_schema_version:
         return ManifestV7(**manifest)
     raise ValueError("Not a manifest.json v7")


-def parse_manifest_v8(manifest: dict) -> ManifestV6:
+def parse_manifest_v8(manifest: dict) -> ManifestV8:
     """Parse manifest.json ver.8"""
     dbt_schema_version = get_dbt_schema_version(artifact_json=manifest)
     if dbt_schema_version == ArtifactTypes.MANIFEST_V8.value.dbt_schema_version:
         return ManifestV8(**manifest)
     raise ValueError("Not a manifest.json v8")


-def parse_manifest_v9(manifest: dict) -> ManifestV6:
+def parse_manifest_v9(manifest: dict) -> ManifestV9:
     """Parse manifest.json ver.9"""
     dbt_schema_version = get_dbt_schema_version(artifact_json=manifest)
     if dbt_schema_version == ArtifactTypes.MANIFEST_V9.value.dbt_schema_version:
         return ManifestV9(**manifest)
     raise ValueError("Not a manifest.json v9")

-def parse_manifest_v10(manifest: dict) -> ManifestV6:
+def parse_manifest_v10(manifest: dict) -> ManifestV10:
     """Parse manifest.json ver.10"""
     dbt_schema_version = get_dbt_schema_version(artifact_json=manifest)
     if dbt_schema_version == ArtifactTypes.MANIFEST_V10.value.dbt_schema_version:
         return ManifestV10(**manifest)
     raise ValueError("Not a manifest.json v10")


-def parse_manifest_v11(manifest: dict) -> ManifestV6:
+def parse_manifest_v11(manifest: dict) -> ManifestV11:
     """Parse manifest.json ver.11"""
     dbt_schema_version = get_dbt_schema_version(artifact_json=manifest)
     if dbt_schema_version == ArtifactTypes.MANIFEST_V11.value.dbt_schema_version:
         return ManifestV11(**manifest)
     raise ValueError("Not a manifest.json v11")


+def parse_manifest_v12(manifest: dict) -> ManifestV12:
+    """Parse manifest.json ver.12"""
+    dbt_schema_version = get_dbt_schema_version(artifact_json=manifest)
+    if dbt_schema_version == ArtifactTypes.MANIFEST_V12.value.dbt_schema_version:
+        return ManifestV12(**manifest)
+    raise ValueError("Not a manifest.json v12")
+
+
 #
 # run-results
 #
-RunResults: TypeAlias = Union[RunResultsV1, RunResultsV2, RunResultsV3, RunResultsV4, RunResultsV5]
+RunResults: TypeAlias = Union[
+    RunResultsV1, RunResultsV2, RunResultsV3, RunResultsV4, RunResultsV5, RunResultsV6
+]


 def parse_run_results(run_results: dict) -> RunResults:
     """Parse run-results.json
     Args:
Expand All @@ -234,6 +254,8 @@ def parse_run_results(run_results: dict) -> RunResults:
return RunResultsV4(**run_results)
elif dbt_schema_version == ArtifactTypes.RUN_RESULTS_V5.value.dbt_schema_version:
return RunResultsV5(**run_results)
elif dbt_schema_version == ArtifactTypes.RUN_RESULTS_V6.value.dbt_schema_version:
return RunResultsV6(**run_results)
raise ValueError("Not a manifest.json")


@@ -276,6 +298,13 @@ def parse_run_results_v5(run_results: dict) -> RunResultsV5:
     raise ValueError("Not a run-results.json v5")


+def parse_run_results_v6(run_results: dict) -> RunResultsV6:
+    """Parse run-results.json v6"""
+    dbt_schema_version = get_dbt_schema_version(artifact_json=run_results)
+    if dbt_schema_version == ArtifactTypes.RUN_RESULTS_V6.value.dbt_schema_version:
+        return RunResultsV6(**run_results)
+    raise ValueError("Not a run-results.json v6")
+
 #
 # sources
 #
69 changes: 30 additions & 39 deletions dbt_artifacts_parser/parsers/catalog/catalog_v1.py
@@ -1,83 +1,74 @@
 # generated by datamodel-codegen:
 # filename: catalog_v1.json
-# timestamp: 2022-03-01T06:21:30+00:00

 from __future__ import annotations

-from datetime import datetime
 from typing import Dict, List, Optional, Union

-from pydantic import Extra, Field
+from pydantic import AwareDatetime, ConfigDict, Field

 from dbt_artifacts_parser.parsers.base import BaseParserModel


 class CatalogMetadata(BaseParserModel):
-
-    class Config:
-        extra = Extra.forbid
-
-    dbt_schema_version: Optional[
-        str] = 'https://schemas.getdbt.com/dbt/catalog/v1.json'
+    model_config = ConfigDict(
+        extra='forbid',
+    )
+    dbt_schema_version: Optional[str] = 'https://schemas.getdbt.com/dbt/catalog/v1.json'
     dbt_version: Optional[str] = '0.19.0'
-    generated_at: Optional[datetime] = '2021-02-10T04:42:33.680487Z'
-    invocation_id: Optional[Optional[str]] = None
+    generated_at: Optional[AwareDatetime] = '2021-02-10T04:42:33.680487Z'
+    invocation_id: Optional[str] = None
     env: Optional[Dict[str, str]] = {}


 class TableMetadata(BaseParserModel):
-
-    class Config:
-        extra = Extra.forbid
-
+    model_config = ConfigDict(
+        extra='forbid',
+    )
     type: str
-    database: Optional[Optional[str]] = None
+    database: Optional[str] = None
     schema_: str = Field(..., alias='schema')
     name: str
-    comment: Optional[Optional[str]] = None
-    owner: Optional[Optional[str]] = None
+    comment: Optional[str] = None
+    owner: Optional[str] = None


 class ColumnMetadata(BaseParserModel):
-
-    class Config:
-        extra = Extra.forbid
-
+    model_config = ConfigDict(
+        extra='forbid',
+    )
     type: str
-    comment: Optional[Optional[str]] = None
+    comment: Optional[str] = None
     index: int
     name: str


 class StatsItem(BaseParserModel):
-
-    class Config:
-        extra = Extra.forbid
-
+    model_config = ConfigDict(
+        extra='forbid',
+    )
     id: str
     label: str
-    value: Optional[Optional[Union[bool, str, float]]] = None
-    description: Optional[Optional[str]] = None
+    value: Optional[Union[bool, str, float]] = None
+    description: Optional[str] = None
     include: bool


 class CatalogTable(BaseParserModel):
-
-    class Config:
-        extra = Extra.forbid
-
+    model_config = ConfigDict(
+        extra='forbid',
+    )
     metadata: TableMetadata
     columns: Dict[str, ColumnMetadata]
     stats: Dict[str, StatsItem]
-    unique_id: Optional[Optional[str]] = None
+    unique_id: Optional[str] = None


 class CatalogV1(BaseParserModel):
-
-    class Config:
-        extra = Extra.forbid
-
+    model_config = ConfigDict(
+        extra='forbid',
+    )
     metadata: CatalogMetadata
     nodes: Dict[str, CatalogTable]
     sources: Dict[str, CatalogTable]
-    errors: Optional[Optional[List[str]]] = None
+    errors: Optional[List[str]] = None