tap-gitlab - sync_pipelines_extended is failing due to difference in schema #79

Open
amalkumarCurve opened this issue May 30, 2022 · 5 comments

Comments

@amalkumarCurve

Error:
Traceback (most recent call last):
File "/Users/amalkumar/venv/bin/tap-gitlab", line 11, in <module>
load_entry_point('tap-gitlab==0.9.15', 'console_scripts', 'tap-gitlab')()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 959, in main
raise exc
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 956, in main
main_impl()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 951, in main_impl
do_sync()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 904, in do_sync
sync_group(gid, pids)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 679, in sync_group
sync_project(pid)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 834, in sync_project
sync_pipelines(data)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 723, in sync_pipelines
sync_pipelines_extended(project, transformed_row)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/__init__.py", line 744, in sync_pipelines_extended
transformed_row = transformer.transform(row, RESOURCES[entity]["schema"], mdata)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer/transform.py", line 152, in transform
raise SchemaMismatch(self.errors)
singer.transform.SchemaMismatch: Errors during transform
user: data does not match {'type': 'object', 'properties': {'name': {'type': 'string'}, 'username': {'type': 'string'}, 'id': {'type': 'integer'}, 'state': {'type': 'string'}}}
committed_at: data does not match {'type': 'string', 'format': 'date-time'}
coverage: data does not match {'type': 'number'}
: data does not match {'type': 'object', 'properties': {'project_id': {'type': ['integer', 'null']}, 'id': {'type': ['integer', 'null']}, 'status': {'type': ['string', 'null']}, 'ref': {'type': ['string', 'null']}, 'sha': {'type': ['string', 'null']}, 'before_sha': {'type': ['string', 'null']}, 'tag': {'type': ['boolean', 'null']}, 'yaml_errors': {'type': ['string', 'null']}, 'user': {'type': 'object', 'properties': {'name': {'type': 'string'}, 'username': {'type': 'string'}, 'id': {'type': 'integer'}, 'state': {'type': 'string'}}}, 'created_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'updated_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'started_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'finished_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'committed_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'duration': {'anyOf': [{'type': 'integer'}, {'type': 'null'}]}, 'coverage': {'anyOf': [{'type': 'number'}, {'type': 'null'}]}, 'web_url': {'type': ['string', 'null']}}}

Errors during transform: [user: data does not match {'type': 'object', 'properties': {'name': {'type': 'string'}, 'username': {'type': 'string'}, 'id': {'type': 'integer'}, 'state': {'type': 'string'}}}, committed_at: data does not match {'type': 'string', 'format': 'date-time'}, coverage: data does not match {'type': 'number'}, : data does not match {'type': 'object', 'properties': {'project_id': {'type': ['integer', 'null']}, 'id': {'type': ['integer', 'null']}, 'status': {'type': ['string', 'null']}, 'ref': {'type': ['string', 'null']}, 'sha': {'type': ['string', 'null']}, 'before_sha': {'type': ['string', 'null']}, 'tag': {'type': ['boolean', 'null']}, 'yaml_errors': {'type': ['string', 'null']}, 'user': {'type': 'object', 'properties': {'name': {'type': 'string'}, 'username': {'type': 'string'}, 'id': {'type': 'integer'}, 'state': {'type': 'string'}}}, 'created_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'updated_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'started_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'finished_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'committed_at': {'anyOf': [{'type': 'string', 'format': 'date-time'}, {'type': 'null'}]}, 'duration': {'anyOf': [{'type': 'integer'}, {'type': 'null'}]}, 'coverage': {'anyOf': [{'type': 'number'}, {'type': 'null'}]}, 'web_url': {'type': ['string', 'null']}}}]
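
A note on reading these errors: the schema reported for `user` is `{'type': 'object', ...}` with no `null` alternative, so one plausible cause (not confirmed anywhere in this thread) is that the GitLab API returned `null` for that field. The sketch below is a simplified stand-in for JSON Schema type checking. The helper `matches_type` is hypothetical and is not part of singer-python; it only shows why a `None` value fails a bare `'object'` type but passes `['object', 'null']`.

```python
from typing import Any, List, Union

# Map JSON Schema type names to Python types (simplified, hypothetical helper).
_JSON_TYPES = {
    "object": dict,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def matches_type(value: Any, schema_type: Union[str, List[str]]) -> bool:
    """Return True if value matches at least one of the JSON Schema type names."""
    names = [schema_type] if isinstance(schema_type, str) else schema_type
    for name in names:
        # bool is a subclass of int in Python, so exclude it from numeric types.
        if isinstance(value, bool) and name in ("integer", "number"):
            continue
        if isinstance(value, _JSON_TYPES[name]):
            return True
    return False


# A null user fails the non-nullable type from the error message...
print(matches_type(None, "object"))            # False
# ...but would pass if the schema also allowed null.
print(matches_type(None, ["object", "null"]))  # True
```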

Steps to reproduce:
tap-gitlab version: 0.9.15 & 0.10.0
Python: 3.7.3

Config:
{
"api_url": "https://",
"private_token": "",
"groups": "",
"projects": "",
"start_date": "",
"ultimate_license": true,
"fetch_merge_request_commits": true,
"fetch_pipelines_extended": true
}

Command:
tap-gitlab --config tap-gitlab-config

@laurentS

Hi @amalkumarCurve, it looks like you're using version 0.9.15 of the tap, which is from the legacy-stable branch and is no longer actively maintained. Is that correct?

If so, can you try switching to version 2 or directly using code from the main branch?

@amalkumarCurve
Author

Hi @laurentS,

Thanks for your response.

I tried version 2 (https://github.com/MeltanoLabs/tap-gitlab/releases/tag/v2.0.0-alpha4), but now I get a different error: 403 Client Error: Forbidden for path: /groups/{group_id}/variables.

Since I disabled the fetch_group_variables flag in my config, shouldn't the path /groups/{group_id}/variables be ignored?

config
{
"api_url": "api/v4",
"private_token": "<private_token>",
"groups": "",
"projects": "",
"start_date": "2022-04-23T00:00:00Z",
"ultimate_license": true,
"fetch_merge_request_commits": false,
"fetch_pipelines_extended": false,
"fetch_group_variables": false,
"fetch_project_variables": false
}

Installation logs:
~ ❯ pip install git+https://github.com/MeltanoLabs/tap-gitlab.git@v2.0.0-alpha4
Collecting git+https://github.com/MeltanoLabs/tap-gitlab.git@v2.0.0-alpha4
Cloning https://github.com/MeltanoLabs/tap-gitlab.git (to revision v2.0.0-alpha4) to /private/var/folders/gw/j33zqy31447crctz4n5ct4nm0000gp/T/pip-req-build-h_ylqe3v
Running command git clone --filter=blob:none --quiet https://github.com/MeltanoLabs/tap-gitlab.git /private/var/folders/gw/j33zqy31447crctz4n5ct4nm0000gp/T/pip-req-build-h_ylqe3v
Running command git checkout -q 7d285b8
Resolved https://github.com/MeltanoLabs/tap-gitlab.git to commit 7d285b8
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting requests<3.0.0,>=2.25.1
Using cached requests-2.27.1-py2.py3-none-any.whl (63 kB)
Collecting requests-cache<0.10.0,>=0.9.3
Downloading requests_cache-0.9.4-py3-none-any.whl (47 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.2/47.2 kB 1.5 MB/s eta 0:00:00
Collecting singer-sdk<0.5.0,>=0.4.4
Downloading singer_sdk-0.4.9-py3-none-any.whl (97 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.3/97.3 kB 3.2 MB/s eta 0:00:00
Collecting PyYAML<7.0,>=6.0
Using cached PyYAML-6.0-cp37-cp37m-macosx_10_9_x86_64.whl (189 kB)
Collecting charset-normalizer~=2.0.0
Using cached charset_normalizer-2.0.12-py3-none-any.whl (39 kB)
Collecting idna<4,>=2.5
Using cached idna-3.3-py3-none-any.whl (61 kB)
Collecting certifi>=2017.4.17
Using cached certifi-2022.5.18.1-py3-none-any.whl (155 kB)
Collecting urllib3<1.27,>=1.21.1
Using cached urllib3-1.26.9-py2.py3-none-any.whl (138 kB)
Collecting attrs<22.0,>=21.2
Using cached attrs-21.4.0-py2.py3-none-any.whl (60 kB)
Collecting appdirs<2.0.0,>=1.4.4
Using cached appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Collecting url-normalize<2.0,>=1.4
Downloading url_normalize-1.4.3-py2.py3-none-any.whl (6.8 kB)
Collecting cattrs<2.0,>=1.8
Using cached cattrs-1.10.0-py3-none-any.whl (29 kB)
Collecting memoization<0.4.0,>=0.3.2
Downloading memoization-0.3.2-py3-none-any.whl (38 kB)
Collecting importlib-metadata
Using cached importlib_metadata-4.11.4-py3-none-any.whl (18 kB)
Collecting inflection<0.6.0,>=0.5.1
Using cached inflection-0.5.1-py2.py3-none-any.whl (9.5 kB)
Collecting cryptography<4.0.0,>=3.4.6
Using cached cryptography-3.4.8-cp36-abi3-macosx_10_10_x86_64.whl (2.0 MB)
Collecting PyJWT<3.0,>=2.3
Using cached PyJWT-2.4.0-py3-none-any.whl (18 kB)
Collecting joblib<2.0.0,>=1.0.1
Using cached joblib-1.1.0-py2.py3-none-any.whl (306 kB)
Collecting pendulum<3.0.0,>=2.1.0
Using cached pendulum-2.1.2-cp37-cp37m-macosx_10_15_x86_64.whl (124 kB)
Collecting backoff<2.0,>=1.8.0
Downloading backoff-1.11.1-py2.py3-none-any.whl (13 kB)
Collecting sqlalchemy<2.0,>=1.4
Downloading SQLAlchemy-1.4.36-cp37-cp37m-macosx_10_14_x86_64.whl (1.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 6.7 MB/s eta 0:00:00
Collecting jsonpath-ng<2.0.0,>=1.5.3
Downloading jsonpath_ng-1.5.3-py3-none-any.whl (29 kB)
Collecting pipelinewise-singer-python==1.2.0
Downloading pipelinewise_singer_python-1.2.0-py3-none-any.whl (24 kB)
Collecting click<9.0,>=8.0
Using cached click-8.1.3-py3-none-any.whl (96 kB)
Collecting pytz<2021.0
Downloading pytz-2020.5-py2.py3-none-any.whl (510 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 510.8/510.8 kB 7.2 MB/s eta 0:00:00
Collecting simplejson==3.11.1
Using cached simplejson-3.11.1.tar.gz (78 kB)
Preparing metadata (setup.py) ... done
Collecting python-dateutil>=2.6.0
Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting backoff<2.0,>=1.8.0
Using cached backoff-1.8.0-py2.py3-none-any.whl (45 kB)
Collecting jsonschema==3.2.0
Using cached jsonschema-3.2.0-py2.py3-none-any.whl (56 kB)
Collecting ciso8601
Using cached ciso8601-2.2.0.tar.gz (18 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting six>=1.11.0
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Requirement already satisfied: setuptools in ./venv/lib/python3.7/site-packages (from jsonschema==3.2.0->pipelinewise-singer-python==1.2.0->singer-sdk<0.5.0,>=0.4.4->tap-gitlab==2.0.0a3) (47.1.0)
Collecting pyrsistent>=0.14.0
Using cached pyrsistent-0.18.1-cp37-cp37m-macosx_10_9_x86_64.whl (68 kB)
Collecting typing_extensions
Using cached typing_extensions-4.2.0-py3-none-any.whl (24 kB)
Collecting cffi>=1.12
Using cached cffi-1.15.0-cp37-cp37m-macosx_10_9_x86_64.whl (178 kB)
Collecting decorator
Using cached decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting ply
Using cached ply-3.11-py2.py3-none-any.whl (49 kB)
Collecting pytzdata>=2020.1
Using cached pytzdata-2020.1-py2.py3-none-any.whl (489 kB)
Collecting greenlet!=0.4.17
Using cached greenlet-1.1.2-cp37-cp37m-macosx_10_14_x86_64.whl (92 kB)
Collecting zipp>=0.5
Using cached zipp-3.8.0-py3-none-any.whl (5.4 kB)
Collecting pycparser
Using cached pycparser-2.21-py2.py3-none-any.whl (118 kB)
Using legacy 'setup.py install' for simplejson, since package 'wheel' is not installed.
Building wheels for collected packages: tap-gitlab, ciso8601
Building wheel for tap-gitlab (pyproject.toml) ... done
Created wheel for tap-gitlab: filename=tap_gitlab-2.0.0a3-py3-none-any.whl size=21146 sha256=6410d523f7367cf3786c4b206c5a64f2ede319ed4b03d5b909887d6a98ad0144
Stored in directory: /private/var/folders/gw/j33zqy31447crctz4n5ct4nm0000gp/T/pip-ephem-wheel-cache-a4uroqmw/wheels/5b/ca/99/3b7c339fc8f1f786201eff9b6308a99f4bb0ad88125e223682
Building wheel for ciso8601 (pyproject.toml) ... done
Created wheel for ciso8601: filename=ciso8601-2.2.0-cp37-cp37m-macosx_10_9_x86_64.whl size=13177 sha256=65b5f29ee3a084dd01bd451cd8b890160c18c487e931e97904160e369ea09d05
Stored in directory: /Users/amalkumar/Library/Caches/pip/wheels/ad/25/8f/3b0a82303191efe3c1204f3741c42d8eb2b0236567e22485de
Successfully built tap-gitlab ciso8601
Installing collected packages: simplejson, pytz, ply, ciso8601, appdirs, zipp, urllib3, typing_extensions, six, PyYAML, pytzdata, pyrsistent, PyJWT, pycparser, memoization, joblib, inflection, idna, greenlet, decorator, charset-normalizer, certifi, backoff, attrs, url-normalize, requests, python-dateutil, jsonpath-ng, importlib-metadata, cffi, cattrs, sqlalchemy, requests-cache, pendulum, jsonschema, cryptography, click, pipelinewise-singer-python, singer-sdk, tap-gitlab
Running setup.py install for simplejson ... done
Successfully installed PyJWT-2.4.0 PyYAML-6.0 appdirs-1.4.4 attrs-21.4.0 backoff-1.8.0 cattrs-1.10.0 certifi-2022.5.18.1 cffi-1.15.0 charset-normalizer-2.0.12 ciso8601-2.2.0 click-8.1.3 cryptography-3.4.8 decorator-5.1.1 greenlet-1.1.2 idna-3.3 importlib-metadata-4.11.4 inflection-0.5.1 joblib-1.1.0 jsonpath-ng-1.5.3 jsonschema-3.2.0 memoization-0.3.2 pendulum-2.1.2 pipelinewise-singer-python-1.2.0 ply-3.11 pycparser-2.21 pyrsistent-0.18.1 python-dateutil-2.8.2 pytz-2020.5 pytzdata-2020.1 requests-2.27.1 requests-cache-0.9.4 simplejson-3.11.1 singer-sdk-0.4.9 six-1.16.0 sqlalchemy-1.4.36 tap-gitlab-2.0.0a3 typing_extensions-4.2.0 url-normalize-1.4.3 urllib3-1.26.9 zipp-3.8.0

Error:
time=2022-05-30 11:11:38 name=tap-gitlab level=INFO message=Tap has custom mapper. Using 1 provided map(s).
{"type": "SCHEMA", "stream": "group_variables", "schema": {"properties": {"group_id": {"type": ["null", "integer"]}, "variable_type": {"type": ["null", "string"]}, "key": {"type": ["null", "string"]}, "value": {"type": ["null", "string"]}, "protected": {"type": ["null", "boolean"]}, "masked": {"type": ["null", "boolean"]}, "environment_scope": {"type": ["null", "string"]}}, "type": "object"}, "key_properties": ["project_id", "key"]}
time=2022-05-30 11:11:38 name=tap-gitlab level=INFO message=INFO METRIC: {'type': 'timer', 'metric': 'http_request_duration', 'value': 0.096695, 'tags': {'endpoint': '/groups/{group_id}/variables', 'http_status_code': 403, 'status': 'failed', 'url': '/api/v4/groups/71/variables', 'context': {'group_path': 'tech', 'group_id': 71}}}
Traceback (most recent call last):
File "/Users/amalkumar/venv/bin/tap-gitlab", line 8, in <module>
sys.exit(TapGitLab.cli())
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/tap_base.py", line 499, in cli
tap.sync_all()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/tap_base.py", line 379, in sync_all
stream.sync()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 1020, in sync
self._sync_records(context)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 962, in _sync_records
self._sync_children(child_context)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 1025, in _sync_children
child_stream.sync(context=child_context)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 1020, in sync
self._sync_records(context)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 946, in _sync_records
for record_result in self.get_records(current_context):
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/rest.py", line 424, in get_records
for record in self.request_records(context):
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/rest.py", line 322, in request_records
resp = decorated_request(prepared_request, context)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/backoff/_sync.py", line 94, in retry
ret = target(*args, **kwargs)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/rest.py", line 235, in _request
self.validate_response(response)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/rest.py", line 165, in validate_response
raise FatalAPIError(msg)
singer_sdk.exceptions.FatalAPIError: 403 Client Error: Forbidden for path: /groups/{group_id}/variables

@laurentS

Indeed! I believe this line

if stream_name in OPTIN_STREAM_NAMES and self.config.get(

should read as (note the not):

if stream_name in OPTIN_STREAM_NAMES and not self.config.get( 

Can you try this out and let me know if it solves your problem?
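
To make the intent of the fix concrete, here is a minimal sketch of the opt-in check with the `not` in place. Everything beyond `OPTIN_STREAM_NAMES`, `stream_name` and `config.get` is an assumption: the quoted line is truncated, so the `fetch_<stream_name>` key naming and the `should_skip` helper are hypothetical, chosen only to match the `fetch_group_variables` flag in the config above.

```python
from typing import Any, Dict

# Assumed contents; the real list in tap-gitlab may differ.
OPTIN_STREAM_NAMES = ["group_variables", "project_variables"]


def should_skip(stream_name: str, config: Dict[str, Any]) -> bool:
    """Skip an opt-in stream unless its flag is truthy in the config."""
    # Hypothetical key naming: "fetch_" + stream name.
    flag = f"fetch_{stream_name}"
    return stream_name in OPTIN_STREAM_NAMES and not config.get(flag, False)


config = {"fetch_group_variables": False, "fetch_project_variables": False}
print(should_skip("group_variables", config))  # True: flag is off, so skip
print(should_skip("projects", config))         # False: not an opt-in stream
```

Without the `not`, the condition would be true only when the flag is enabled, so a disabled opt-in stream would still be synced, which is consistent with the 403 reported above.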

@amalkumarCurve
Author

@laurentS
I tried that and am now getting this error:

Traceback (most recent call last):
File "/Users/amalkumar/venv/bin/tap-gitlab", line 8, in <module>
sys.exit(TapGitLab.cli())
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/tap_base.py", line 499, in cli
tap.sync_all()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/tap_base.py", line 380, in sync_all
stream.finalize_state_progress_markers()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 907, in finalize_state_progress_markers
child_stream.finalize_state_progress_markers()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 907, in finalize_state_progress_markers
child_stream.finalize_state_progress_markers()
File "/Users/amalkumar/venv/lib/python3.7/site-packages/singer_sdk/streams/core.py", line 910, in finalize_state_progress_markers
for context in self.partitions or [{}]:
File "/Users/amalkumar/venv/lib/python3.7/site-packages/tap_gitlab/client.py", line 171, in partitions
"Could not detect partition type for Gitlab stream "
ValueError: Could not detect partition type for Gitlab stream 'epic_issues' (/groups/{group_id}/epics/{epic_iid}/issues). Expected a URL path containing '{project_path}' or '{group_path}'.
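
The message suggests that the partition type is inferred from placeholders in the stream's URL path. A rough sketch of such a check follows. `detect_partition_type` is a hypothetical name, not the tap's actual code, and the `/groups/{group_path}` example path is assumed; the sketch only mirrors the behaviour described in the error text.

```python
def detect_partition_type(stream_name: str, path: str) -> str:
    """Infer whether a stream is partitioned by project or by group."""
    if "{project_path}" in path:
        return "project"
    if "{group_path}" in path:
        return "group"
    raise ValueError(
        f"Could not detect partition type for Gitlab stream '{stream_name}' "
        f"({path}). Expected a URL path containing "
        "'{project_path}' or '{group_path}'."
    )


# Assumed example path containing the {group_path} placeholder.
print(detect_partition_type("groups", "/groups/{group_path}"))  # group

# The epic_issues path uses {group_id} and {epic_iid}, so neither placeholder matches.
try:
    detect_partition_type("epic_issues", "/groups/{group_id}/epics/{epic_iid}/issues")
except ValueError as err:
    print(err)
```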

@aaronsteers
Contributor

aaronsteers commented Jun 1, 2022

Related: I've found that the legacy version of this tap failed silently when access was denied on a number of stream types. I've started #78, which would give the new 2.x edition the ability to ignore access-denied errors when they are encountered.
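
As an illustration of the idea only (this is not the implementation in #78, and all names below are hypothetical), a stream could treat a 403 response as "skip this stream" instead of a fatal error:

```python
from typing import Callable, Iterable, List


class AccessDenied(Exception):
    """Hypothetical error raised when the API answers 403 Forbidden."""


def records_or_skip(fetch: Callable[[], Iterable[dict]], ignore_denied: bool) -> List[dict]:
    """Return fetched records, or an empty list if access is denied and ignored."""
    try:
        return list(fetch())
    except AccessDenied:
        if ignore_denied:
            return []  # skip the stream instead of aborting the whole sync
        raise


def fetch_group_variables() -> Iterable[dict]:
    # Stand-in for a request to /groups/{group_id}/variables that returns 403.
    raise AccessDenied("403 Client Error: Forbidden")


print(records_or_skip(fetch_group_variables, ignore_denied=True))  # []
```

With `ignore_denied=False` the exception propagates, which corresponds to the fatal 403 seen earlier in this thread.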
