-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Airbyte-ci: Add local cdk support #30461
Changes from all commits
9a86bf8
3402a84
2488fea
42621ce
1353db1
f9ef67c
5a22a2a
390f6b1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -365,6 +365,25 @@ def with_python_connector_source(context: ConnectorContext) -> Container: | |
return with_python_package(context, testing_environment, connector_source_path) | ||
|
||
|
||
async def apply_python_development_overrides(context: ConnectorContext, connector_container: Container) -> Container: | ||
# Run the connector using the local cdk if flag is set | ||
if context.use_local_cdk: | ||
context.logger.info("Using local CDK") | ||
# mount the local cdk | ||
path_to_cdk = "airbyte-cdk/python/" | ||
directory_to_mount = context.get_repo_dir(path_to_cdk) | ||
|
||
context.logger.info(f"Mounting {directory_to_mount}") | ||
|
||
# Install the airbyte-cdk package from the local directory | ||
# We use --no-deps to avoid conflicts with the airbyte-cdk version required by the connector | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Won't this prevent us from being able to test the local CDK if it has any new/updated dependencies? For new dependencies, this will throw an exception, but for updates it'll be undefined (i.e. we won't necessarily be testing the code that'll be used in production). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @clnoll great question! and you are exactly right it will ignore sub-dependency conflicts that may crop up. e.g. if The alternative is that we do not include
The reason I did not choose this route was because
@clnoll more than happy to change the approach, what are your thoughts? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 thanks! I want to take a closer look at the code and understand why we’re stuck with this tradeoff between --no-deps and extra steps, given that the original implementation of the local cdk flag didn’t have that tradeoff. This isn’t a super common scenario but it’s common enough that it doesn’t feel quite right to me to skip the deps. I hear you re the fact that conflicts would be caught before release, but I think that rooting out the issue early is important. Correct me if I'm wrong, but with the current implementation, to fix a dependency conflict I believe we’d have to toggle back and forth between updating the CDK and updating the connector until we settle on a version that agrees with both. Some conflicts could be resolved by doing that locally, but sometimes there are issues that will only crop up in certain development environments. i.e. you’ll see them in CI but not during local development, especially if it involves something like numpy or pandas that has a C dependency - and conflicts related to those libraries are the biggest sources of these problems that I've seen. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ah whoops forgot to put a link to part of the why:
I can answer that! So the original implementation used This approach is both brittle and a victim of the python ecosystem where there are approximately 5 "standard" ways in 2023 to define your dependencies
Were currently planning on forcing python connectors to all move to the "pyproject.toml + poetry.lock" method but thats still loose plans and today were stuck with 5 possible ways that the To continue to use the However, because they all end up using
As the trade off is to either
🤔 I dont think you'd have to toggle back and forth. Let me explain Lets assume when your updating a connector to use a new version of the cdk you have this goal in mind
WIth this you'll likely target latest cdk.
I dont think a connector dev would be messing with cdk dependency versions @clnoll Is that a fair way to think about it?
Were hoping this method solves this very issue where There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks so much for the details, @bnchrch! I agree with your description of how a connector dev would be thinking about this situation. Re the CDK dev perspective - we do test changes on existing connectors and I could see us wanting to make sure that a package upgrade was appropriate for specific connectors, particularly if it's a popular connector, or if a CDK update was made with solving problems specific to those connectors. But, if we're confident that |
||
connector_container = connector_container.with_mounted_directory(f"/{path_to_cdk}", directory_to_mount).with_exec( | ||
["pip", "install", "--no-deps", f"/{path_to_cdk}"], skip_entrypoint=True | ||
) | ||
|
||
return connector_container | ||
|
||
|
||
async def with_python_connector_installed(context: ConnectorContext) -> Container: | ||
"""Install an airbyte connector python package in a testing environment. | ||
|
||
|
@@ -390,10 +409,14 @@ async def with_python_connector_installed(context: ConnectorContext) -> Containe | |
".dockerignore", | ||
] | ||
] | ||
return await with_installed_python_package( | ||
container = await with_installed_python_package( | ||
context, testing_environment, connector_source_path, additional_dependency_groups=["dev", "tests", "main"], exclude=exclude | ||
) | ||
|
||
container = await apply_python_development_overrides(context, container) | ||
|
||
return container | ||
|
||
|
||
async def with_ci_credentials(context: PipelineContext, gsm_secret: Secret) -> Container: | ||
"""Install the ci_credentials package in a python environment. | ||
|
@@ -858,11 +881,15 @@ async def with_airbyte_java_connector(context: ConnectorContext, connector_java_ | |
|
||
async def get_cdk_version_from_python_connector(python_connector: Container) -> Optional[str]: | ||
pip_freeze_stdout = await python_connector.with_entrypoint("pip").with_exec(["freeze"]).stdout() | ||
pip_dependencies = [dep.split("==") for dep in pip_freeze_stdout.split("\n")] | ||
for package_name, package_version in pip_dependencies: | ||
if package_name == "airbyte-cdk": | ||
return package_version | ||
return None | ||
cdk_dependency_line = next((line for line in pip_freeze_stdout.split("\n") if "airbyte-cdk" in line), None) | ||
if not cdk_dependency_line: | ||
return None | ||
|
||
if "file://" in cdk_dependency_line: | ||
return "LOCAL" | ||
|
||
_, cdk_version = cdk_dependency_line.split("==") | ||
return cdk_version | ||
|
||
|
||
async def with_airbyte_python_connector(context: ConnectorContext, build_platform: Platform) -> Container: | ||
|
@@ -876,8 +903,12 @@ async def with_airbyte_python_connector(context: ConnectorContext, build_platfor | |
.build(await context.get_connector_dir()) | ||
.with_label("io.airbyte.name", context.metadata["dockerRepository"]) | ||
) | ||
|
||
connector_container = await apply_python_development_overrides(context, connector_container) | ||
|
||
cdk_version = await get_cdk_version_from_python_connector(connector_container) | ||
if cdk_version: | ||
context.logger.info(f"Connector has a cdk dependency, using cdk version {cdk_version}") | ||
connector_container = connector_container.with_label("io.airbyte.cdk_version", cdk_version) | ||
context.cdk_version = cdk_version | ||
if not await connector_container.label("io.airbyte.version") == context.metadata["dockerImageTag"]: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api" | |
|
||
[tool.poetry] | ||
name = "pipelines" | ||
version = "1.2.3" | ||
version = "1.3.0" | ||
description = "Packaged maintained by the connector operations team to perform CI for connectors' pipelines" | ||
authors = ["Airbyte <[email protected]>"] | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to see an unit test on this. I'll implement one when I'll port it on top of #30474