Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Publish the response to process in the context of the record's transformation and filtering steps #193

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

rpopov
Copy link

@rpopov rpopov commented Dec 31, 2024

Feat: Publish the response to process in the context of the record's transformation and filtering steps
See:

In the declarative sources, after the record extraction, in the phase of record transformation and record filtering:

  • Bind the root object from which the record is extracted into the record itself to allow additional filed calculations.
  • The root object is:
    • the top-level JSON object, when the response is a single JSON object and the current record is nested in it;
    • the individual record object, when the response is a JSONL (file, CSV, etc) list or records
  • Bind the response to the $root ket, following the convention of $parameters object.
  • Establish the convention that any field name starting with "$" is a service field, visible only in the filtering and transformation phases, that should not be visible further in the process. Examples: $root, any other to be added in future, any added field using AddField transformation, holding local interim results)
  • Remove the service fields after the record transformation to avoid storing them.

Summary by CodeRabbit

  • New Features

    • Enhanced JSON decoding with improved error handling and logging.
    • Added support for preserving service keys during record extraction and transformation.
    • Improved record flattening with service key preservation.
    • Expanded test coverage for incremental state handling and service key management.
  • Bug Fixes

    • Refined pagination strategy to handle empty decoder responses.
    • Improved record selector and extractor logic for nested data structures.
  • Documentation

    • Updated contribution guidelines for clarity.
    • Refined release documentation for consistency.
  • Chores

    • Updated Prettier dependency to version 3.4.2.

and avoid presenting the empty JSON as an array of an empty object
…lic structures

There is an internal Flatten fransformation - let it skip the service fields
processing other than dict types. Fixed the end page parsing, when it is
empty, the token read defaults to {}
* local for the filter and transform steps
* not visible to the next mapping and store steps (as they should be)
* not visible in the tests beyond the test_record_selector (as they should be)

This allows the record transformation and filtering logic to define its
"local vcariables" in the format of service fields to reuse some interim
calculations.
… the

record selection (filteriung and transformation), as these fields became
local for the steps above.
Copy link
Contributor

coderabbitai bot commented Dec 31, 2024

📝 Walkthrough

Walkthrough

This pull request introduces comprehensive enhancements to the Airbyte CDK's declarative source components, focusing on improving error handling, record extraction, and service key management. The changes span multiple files across decoders, extractors, and transformations, introducing new utility functions for handling service keys, refining JSON parsing logic, and updating test suites to ensure robust functionality.

Changes

File Change Summary
airbyte_cdk/sources/declarative/decoders/json_decoder.py Enhanced error logging and simplified JSON parsing logic
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py Added RECORD_ROOT_KEY and update_record function to preserve root response context
airbyte_cdk/sources/declarative/extractors/record_extractor.py Introduced service key management functions and SERVICE_KEY_PREFIX constant
airbyte_cdk/sources/declarative/extractors/record_selector.py Added method to remove service keys from records
airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py Improved error handling in next_page_token method
airbyte_cdk/sources/declarative/transformations/flatten_fields.py Modified to preserve service keys during record flattening
Multiple test files Updated test cases to cover new functionality and edge cases

Suggested labels

enhancement

Suggested reviewers

  • aaronsteers
  • maxi297
  • ChristoGrab

Hey there! 👋 I noticed some interesting improvements in this PR. Would you like me to elaborate on any specific changes? The service key handling and error logging enhancements look particularly neat. Wdyt?

The modifications seem to make the declarative source components more robust and flexible. Is there anything specific you'd like me to dive deeper into? 🕵️‍♀️

Finishing Touches

  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (31)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (4)

11-16: Neat introduction of service key convention!
This clearly explains why a prefix is needed to segregate service keys. Would you consider adding a brief usage example in a docstring, wdyt?


22-26: In-place removal of service keys
This function effectively mutates the original dictionary. Would you consider highlighting the mutating nature in the function name, for instance remove_service_keys_in_place, wdyt?


28-30: Helper function is_service_key
The prefix check is straightforward. One minor note: using startswith(SERVICE_KEY_PREFIX) might be more expressive. wdyt?


32-33: Assertion for presence of service keys
This ensures that at least one service key remains in the mapping, which is helpful for validating transformations. Would you consider returning a more descriptive error message to clarify which key is missing, wdyt?

airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2)

8-8: Usage of is_service_key
Nice import to preserve the logic in a single place. Would you consider re-exporting is_service_key from a central file so other modules also import it consistently, wdyt?


36-44: Preserving service keys
By skipping flattening for service keys, you efficiently maintain their structure. Could we add a comment to describe why we early-return for service keys, wdyt?

unit_tests/sources/declarative/transformations/test_flatten_fields.py (1)

50-58: Test preserving service key with a string value
This sequence effectively checks that the prefixed field remains unflattened. Maybe we can add an additional assertion verifying the shape of the resultant dictionary for completeness, wdyt?

unit_tests/sources/declarative/extractors/test_record_extractor.py (2)

1-12: New test file creation
Good to see a dedicated test suite for service key handling. Maybe consider adding docstrings describing the purpose behind each test function, wdyt?


30-44: Test remove_service_keys
Great approach for ensuring in-place key removal. Possibly add a scenario with nested dictionaries to confirm behavior, wdyt?

airbyte_cdk/sources/declarative/decoders/json_decoder.py (2)

38-41: Graceful handling of JSON decode failure
Catching JSONDecodeError and yielding from an empty list is a smooth approach to avoid breaking the flow. Would you be open to adding a short comment clarifying that we skip all records if parsing fails, wdyt?


47-50: Flexible handling of list vs. single object
Yielding directly from the object if it’s a list, or wrapping it into a list otherwise, streamlines the logic. Looks good. Would you consider adding a small docstring example clarifying the single-object vs. multiple-object scenario, wdyt?

airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1)

65-65: Fallback for decoder results
Using next(self.decoder.decode(response), {}) prevents StopIteration if nothing is yielded, which is handy. Do we want to log a warning if this default kicks in, wdyt?

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3)

19-21: Establishing a consistent root key
Defining RECORD_ROOT_KEY with SERVICE_KEY_PREFIX + "root" is a neat pattern. Any chance we might rename it to __root__ or something similarly distinctive, wdyt?


88-88: Preserving the complete body as root
Storing body in root_response is a convenient handle for later usage. Perhaps we might add a comment explaining its role in subsequent transformations, wdyt?


98-100: Adding root context to each record
Using update_record(record, root_response) for items in the extracted list is great for easy transformations. Would you consider a quick doc mention that each record now includes the $root data, wdyt?

unit_tests/sources/declarative/decoders/test_json_decoder.py (2)

15-18: Model and factory imports
Importing DeclarativeStream and ModelToComponentFactory might be part of a broader test suite. Should we add a small integration test showcasing these, wdyt?


22-27: Testing empty and non-empty JSON
The parameterization ensures coverage for empty and valid JSON, which is great. Maybe add a test case for malformed JSON to confirm we yield an empty list, wdyt?

airbyte_cdk/sources/declarative/extractors/record_selector.py (1)

114-115: Great approach filtering service keys before normalization!
By removing these keys early, you avoid persisting unnecessary or sensitive fields in subsequent transformations. Would you consider adding a small inline comment explaining that this step is purely for service key removal, so new contributors understand the rationale? Wdyt?

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5)

45-46: Well-structured nested array test coverage!
Testing with a single record containing nested entries is helpful. Wdyt about adding checks for uniquely typed nested fields?


47-51: Extended nested structure test ensures correctness.
It’s great for verifying deeper nesting. Should we also test a scenario with partial missing nested fields? Wdyt?


58-63: Valid approach for repeated testing of deeply nested arrays!
Everything is consistent with previous param sets. Would you like to verify any type conversions here, or is that out of scope? Wdyt?


66-71: Check for single-field nested extraction is well-defined!
No issues here. Possibly consider negative tests (e.g., absent ‘id’ key). Wdyt?


169-170: Ensuring nested extraction with JSONL lines is validated.
Are you comfortable with the coverage? Possibly add an invalid JSON line test? Wdyt?

unit_tests/sources/declarative/extractors/test_record_selector.py (4)

13-16: Adding exclude_service_keys and verify_service_keys_exist is consistent with the rest of the suite.
This promotes uniform logic. Wdyt about also testing default behavior (no service keys found)?


47-60: New test for returning all records with nested data.
It’s well-defined. Would you consider adding a partial-nesting scenario for completeness? Wdyt?


62-75: Testing {{True}} filter with nested data.
Covers a universal pass scenario effectively. Possibly try a trivial fail scenario too?


166-168: Applying exclude_service_keys after record selection is correct.
This ensures the final output is free of service fields. Maybe mention that in code comments, wdyt?

unit_tests/sources/declarative/retrievers/test_simple_retriever.py (1)

242-242: New assertion checking final state for full refresh sync completion.
This is excellent for clarity. Consider adding a note about possible partial state merges? Wdyt?

unit_tests/sources/declarative/test_manifest_declarative_source.py (1)

1764-1764: Add a descriptive message in the assertion? Since this test checks if the discovered records match the expected ones, do you want to include a clarifying message in the assert to help diagnose failures more quickly? For example, assert output_data == expected_records, f"Mismatch in {test_name}...". wdyt?

unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (2)

325-327: Use a more descriptive variable name for clarity?
Currently, records_from_state might benefit from a more explicit name that conveys its content and purpose, such as records_read_from_intermediate_state. Wdyt?


Line range hint 331-333: Consider an alternative deduplication approach for large data sets?
Here, converting records to a JSON string and using them as dictionary keys might be memory-intensive for large volumes of data. Would you consider using a more straightforward approach (e.g., hashing tuples of normalized content) to reduce overhead? Wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f8cb659 and e693c94.

📒 Files selected for processing (14)
  • airbyte_cdk/sources/declarative/decoders/json_decoder.py (1 hunks)
  • airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3 hunks)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • airbyte_cdk/sources/declarative/extractors/record_selector.py (3 hunks)
  • airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1 hunks)
  • airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2 hunks)
  • unit_tests/sources/declarative/decoders/test_json_decoder.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
  • unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (1 hunks)
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source.py (1 hunks)
  • unit_tests/sources/declarative/transformations/test_flatten_fields.py (2 hunks)
🔇 Additional comments (32)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1)

18-19: Function exclude_service_keys
Great approach to returning a filtered mapping that excludes service keys. Would you consider clarifying in the docstring that this function doesn't mutate the original mapping, wdyt?

unit_tests/sources/declarative/transformations/test_flatten_fields.py (2)

7-7: Import of SERVICE_KEY_PREFIX
Bringing in the constant aids test readability. Looks great!


59-67: Set usage with service key
Admirable to test sets alongside lists. Everything looks good. Perhaps you could add a case with nested service keys in the future, wdyt?

unit_tests/sources/declarative/extractors/test_record_extractor.py (4)

15-28: Test exclude_service_keys
Testing multiple scenarios is helpful. This thoroughness is great for coverage!


46-56: Check presence of service keys
The parameterized approach covers a variety of input shapes. Looks comprehensive!


58-67: Handling the absence of service keys
Clever usage of try/except to confirm the assertion is raised. No further issues here.


69-74: Function test_service_field and test_regular_field
Straightforward tests confirming the logic behind service and non-service fields. This is quite clear and concise!

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3)

12-15: Imports for service keys
These additions ensure we can properly reference the SERVICE_KEY_PREFIX. Everything looks in order. Feeling confident about it, wdyt?


23-30: Utility function for record injection
update_record effectively injects the root data when record is a dict. Would you consider logging or handling non-dict cases differently instead of passing them through unchanged, wdyt?


101-101: Single object extraction
Similarly, returning the updated single object ensures consistency. LGTM. Maybe just confirm there's no performance overhead from the extra dictionary copy, wdyt?

unit_tests/sources/declarative/decoders/test_json_decoder.py (4)

11-12: New imports for testing
Introducing YamlDeclarativeSource and SyncMode could indicate expanded scenarios. Are we using them in these tests, or do we plan to, wdyt?


30-33: Verifying empty string yields empty list
This aligns with the new JsonDecoder behavior. Kudos for thorough testing. Good to go!


40-41: jsonl decoder edge cases
Checking an empty object returns [{}] and an empty list returns [[]] is interesting. Is that the desired behavior, or should empty lists remain empty, wdyt?


48-54: Clearer parameter IDs
The new ids for parameterized tests clarifies each scenario well. This is helpful in test logs. Good job!

airbyte_cdk/sources/declarative/extractors/record_selector.py (2)

11-14: Imports from record_extractor look good!
Nice job importing exclude_service_keys here. This keeps validation logic centralized. Wdyt about also adding a short docstring to indicate why it’s being imported?


164-169: Helper method _remove_service_keys is concise and cohesive!
It clearly separates the logic for removing service keys from the main flow. The name is self-explanatory, which is great. Wdyt about possibly adding a quick docstring snippet describing usage?

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (13)

18-21: LGTM on the new imports!
Bringing in exclude_service_keys and verify_service_keys_exist ensures consistent testing across your modules. Wdyt about clarifying in code comments how each function is used in these tests?


40-41: Empty string handling test looks good!
It’s valuable to confirm no records are returned for empty input. Everything seems fine here.


52-57: Multiple-record nested scenario coverage is clear!
No immediate issues. Maybe consider adding a note on expected behavior if partially malformed JSON arises?


64-65: Single field extraction from root object tested.
This single-case approach is straightforward. Looks good to me!


72-77: Similar patterned test for multiple nested fields.
Maintaining consistency is a good practice. Thanks for adding clarity.


78-83: Handling multiple dictionaries at the same depth.
All good. The param definitions are descriptive.


84-94: Thorough test with an even bigger nested structure.
Thanks for covering that scenario. Seems robust!


126-127: JSONL extraction tests extended for empty data.
That ensures no unexpected defaults appear. Good coverage!


156-160: Test names are descriptive!
Naming them like “test_extract_from_empty_string” is helpful. This improves maintainability.


161-164: Continuing the pattern with param-based test IDs.
All consistent. This is a neat approach.


167-168: Confirmation that JSONL extraction matches single array.
Looks great; no changes needed.


177-178: Handles multiline JSONL with text fields.
Yes, this is crucial to ensure newlines are handled properly.


196-200: Service key management verification is great!
This final step cements correctness. Encouraging thorough coverage.

unit_tests/sources/declarative/extractors/test_record_selector.py (1)

111-123: Verifies use of $root in filter conditions.
This is a valuable scenario, capturing references to the parent. Great job!

unit_tests/sources/declarative/retrievers/test_simple_retriever.py (1)

360-360: Ensuring __ab_full_refresh_sync_complete is set properly.
This line addresses the correct final state. LGTM!

unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (1)

Line range hint 335-341: Double-check your sorting logic for correctness?
Using a string serialization for sorting is perfectly acceptable in tests. However, if the order of keys changes or if floating-point precision is altered, it might trigger unexpected results. Did you consider any possible corner cases with sorting by JSON representation? Wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (1)

23-29: Consider adding type hints for better code clarity, wdyt?

The function looks good! Would you consider adding more specific type hints to improve code clarity? For example:

-def update_record(record: Any, root: Any) -> Any:
+def update_record(record: Union[dict, Any], root: Union[dict, list, Any]) -> Union[dict, Any]:
docs/CONTRIBUTING.md (1)

43-44: Consider rephrasing to avoid repetition, wdyt?

The instructions are clear, but there's repetition of "Run" at the start of consecutive lines. Maybe we could rephrase like this?

-Run `poetry run ruff format` to format your Python code.
-Run `poetry run ruff check .` to report the not fixed issues. Fix them manually.
+Use `poetry run ruff format` to format your Python code.
+Then `poetry run ruff check .` to report any remaining issues. Fix these manually.
🧰 Tools
🪛 LanguageTool

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/RELEASES.md (2)

52-52: Consider creating a GitHub issue for the TODO.

This TODO seems important for improving the release process visibility. Would you like me to create a GitHub issue to track this improvement?


60-64: Consider improving the formatting of the instructions.

The instructions are clear but could be more readable with better formatting. What do you think about this structure?

-Once the publish pipeline has completed, choose a connector to test. Set the base_image in the connector's metadata to your pre-release version in Dockerhub (make sure to update the SHA as well).
-Next, build the pre-release image locally using `airbyte-ci connectors —name=<source> build`.
-You can now run connector interfaces against the built image using the pattern `docker run airbyte/<source-name>:dev <spec/check/discover/read>`.
-The connector's README should include a list of these commands, which can be copy/pasted and run from the connector's directory for quick testing against a local config.
-You can also run `airbyte-ci connectors —name=<source> test` to run the CI test suite against the dev image.
+1. Choose a connector to test after the publish pipeline completes
+2. Update the connector's metadata:
+   - Set base_image to your pre-release version in Dockerhub
+   - Update the SHA accordingly
+3. Build the pre-release image locally:
+   ```bash
+   airbyte-ci connectors —name=<source> build
+   ```
+4. Run connector interfaces against the built image:
+   ```bash
+   docker run airbyte/<source-name>:dev <spec/check/discover/read>
+   ```
+   Note: The connector's README includes these commands for quick testing.
+5. Optionally run the CI test suite:
+   ```bash
+   airbyte-ci connectors —name=<source> test
+   ```
🧰 Tools
🪛 LanguageTool

[grammar] ~60-~60: Did you mean the noun “publishing”?
Context: ...ting Manifest-Only connectors Once the publish pipeline has completed, choose a connec...

(PREPOSITION_VERB)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e693c94 and daaea38.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (6)
  • airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3 hunks)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • docs/CONTRIBUTING.md (1 hunks)
  • docs/RELEASES.md (2 hunks)
  • package.json (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • package.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py
  • unit_tests/sources/declarative/extractors/test_record_extractor.py
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/RELEASES.md

[uncategorized] ~56-~56: Possible missing article found.
Context: ...lly test changes against a dev image of SDM before committing to a release, first u...

(AI_HYDRA_LEO_MISSING_THE)


[grammar] ~60-~60: Did you mean the noun “publishing”?
Context: ...ting Manifest-Only connectors Once the publish pipeline has completed, choose a connec...

(PREPOSITION_VERB)


[grammar] ~68-~68: Did you mean the noun “publishing”?
Context: ...ng Low-Code Python connectors Once the publish pipeline has completed, set the version...

(PREPOSITION_VERB)

🔇 Additional comments (2)
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2)

19-20: LGTM! Clear and descriptive constant definition.

The constant follows the established naming convention and its purpose is well-documented in the comment.


Line range hint 88-101: Verify the memory impact of storing the root response.

The changes look good! Just want to ensure we've considered the memory implications:

  1. The root response is now stored in each record
  2. For large responses, this could increase memory usage significantly

Also, would you consider adding a comment explaining why we're storing the root response? This would help future maintainers understand the design decision. 🤔

✅ Verification successful

Let me search for test cases and actual usage patterns to better understand the memory implications. The previous command might have failed because the path pattern wasn't specific enough.


Memory impact looks reasonable based on test cases 🌟

Looking at the test cases, I can see that the responses being handled are relatively small (mostly containing simple records with IDs and nested objects). The update_record function creates a shallow copy of the dictionary and only stores a reference to the root response, not a deep copy.

A few observations that make me feel confident about the memory usage:

  • Most test cases work with small JSON payloads
  • The root response is stored by reference, not by copying
  • The pattern is used consistently across all test cases without any reported issues

Would you consider adding a quick comment explaining that we're storing a reference to the root response for downstream processing? Something like:

# Store reference to root response to allow downstream processors access to the full context
root_response = body

wdyt? 🤔

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for typical response sizes in tests
rg -A 5 "def test.*extract_records" unit_tests/

Length of output: 49


Script:

#!/bin/bash
# Let's check the test files specifically for the DPath extractor
fd "test.*dpath.*\.py" unit_tests/ --exec cat {}

# Also check the actual usage of update_record to understand the memory implications
rg -B 3 -A 3 "update_record" 

Length of output: 9434

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (3)

17-18: Consider adding type hints for better code clarity, wdyt?

The function looks good, but we could make it more explicit about the return type being a new mapping rather than modifying the input.

-def exclude_service_keys(mapping: Mapping[str, Any]) -> Mapping[str, Any]:
+def exclude_service_keys(mapping: Mapping[str, Any]) -> dict[str, Any]:

21-24: Consider adding return type annotation for consistency.

The function modifies the dict in place, which is clear from the implementation. Would you consider adding a return type annotation of None to make this explicit?

-def remove_service_keys(mapping: dict[str, Any]):  # type: ignore[no-untyped-def]
+def remove_service_keys(mapping: dict[str, Any]) -> None:

31-32: Fix typo in assertion message.

There's a small typo in the assertion message.

-    assert mapping != exclude_service_keys(mapping), "Expected service are present"
+    assert mapping != exclude_service_keys(mapping), "Expected service keys are present"
docs/CONTRIBUTING.md (1)

43-46: Consider restructuring commands for better readability, wdyt?

The commands are clear but start with repetitive "Run". Maybe we could make it more concise and readable?

-# Run `poetry run ruff format` to format your Python code.
-# Run `poetry run ruff check --fix` to fix the formatting issues.
-# Run `poetry run ruff check .` to report the not fixed issues. Fix them manually.
-# Run `poetry run mypy --config-file mypy.ini airbyte_cdk` to validate the code. Resolve the reported issues.
+Format and validate your code using these commands:
+
+1. `poetry run ruff format` - Format Python code
+2. `poetry run ruff check --fix` - Auto-fix formatting issues
+3. `poetry run ruff check .` - Report remaining issues (fix manually)
+4. `poetry run mypy --config-file mypy.ini airbyte_cdk` - Validate type annotations
🧰 Tools
🪛 LanguageTool

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check --fix` to fix th...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... --fixto fix the formatting issues. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between daaea38 and 9a05c94.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • docs/CONTRIBUTING.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check --fix` to fix th...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... --fixto fix the formatting issues. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (1)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1)

10-14: LGTM! Clear documentation of service field conventions.

The documentation clearly explains the convention for service fields and their prefix. This aligns well with the PR objective of binding the response root object to records.

Minor edit to trigger the reevaluation of the integration tests.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/RELEASES.md (2)

60-61: How about making the testing instructions more explicit? 🤔

The instructions are good, but what do you think about these small improvements to make them even clearer?

  1. Replace angle brackets with concrete examples:
-airbyte-ci connectors —name=<source> build
+airbyte-ci connectors —name=source-example build  # Replace 'source-example' with your connector name
  1. Add a note about configuration requirements:
 You can now run connector interfaces against the built image using the pattern `docker run airbyte/<source-name>:dev <spec/check/discover/read>`.
+# Make sure you have a valid config.json file in your current directory for check/discover/read commands

wdyt? 🙂

Also applies to: 68-68


52-52: Would you like me to help track this TODO? 🎯

This is a good suggestion about improving version inspection without requiring prod DB access. Would you like me to create a GitHub issue to track this improvement?

I can help create an issue with:

  • Title: "Improve SDM version inspection without prod DB access"
  • Description detailing the current limitations and desired outcome
  • Labels: documentation, enhancement

Let me know if you'd like me to proceed! 😊

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a05c94 and 6587873.

📒 Files selected for processing (1)
  • docs/RELEASES.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/RELEASES.md

[uncategorized] ~56-~56: Possible missing article found.
Context: ...lly test changes against a dev image of SDM before committing to a release, first u...

(AI_HYDRA_LEO_MISSING_THE)


[grammar] ~68-~68: Did you mean the noun “publishing”?
Context: ...ng Low-Code Python connectors Once the publish pipeline has completed, set the version...

(PREPOSITION_VERB)

🔇 Additional comments (1)
docs/RELEASES.md (1)

14-14: LGTM! Nice formatting improvement! 👍

The consistent formatting with italics makes the notes section more readable.

@rpopov
Copy link
Author

rpopov commented Jan 1, 2025

The acceptance test failures are random and unrelated to the actual change.
What is the practice in this case?

Update CONTRIBUTING.md to clarify and make explicit the procedures and steps.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/CONTRIBUTING.md (4)

17-18: The Poetry installation instructions could be more specific, wdyt?

Consider adding the recommended installation method for different operating systems to help new contributors get started quickly. For example:

1. Install Poetry:
   - macOS: `curl -sSL https://install.python-poetry.org | python3 -`
   - Windows: `(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -`
   - Linux: `curl -sSL https://install.python-poetry.org | python3 -`

26-28: Consider restructuring the test commands section for better readability, wdyt?

The commands are clear but could be organized better to avoid repetitive sentence starts. Here's a suggestion:

### Running Unit Tests

Use these commands to run tests:
- All tests: `poetry run pytest`
- Specific tests: `poetry run pytest -k <suite or test name>`
- Fast tests only (<5 min): `poetry run pytest-fast`
🧰 Tools
🪛 LanguageTool

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


59-74: Unify the list style in the assumptions and steps sections

The static analysis tool flagged inconsistent list markers (asterisks vs. dashes). Consider using dashes consistently throughout the document for better maintainability:

 **Assumptions:**
-* The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
-* The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
+- The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
🧰 Tools
🪛 Markdownlint (0.37.0)

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-65: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-66: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


72-72: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-73: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-74: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


174-174: Consider adding a note about Docker networking

Since the section discusses running tests in Docker, it might be helpful to add a note about potential networking issues and how to resolve them, similar to the MockServer section that mentions using host.docker.internal. What do you think?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6587873 and a33f815.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~35-~35: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-65: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-66: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


72-72: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-73: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-74: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


82-82: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)

docs/CONTRIBUTING.md Show resolved Hide resolved
Update CONTRIBUTING.md to state the actual testing steps.

The integration tests of the CDK fail due to missing `secrets/config.json` file for the Shopify source.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
docs/CONTRIBUTING.md (4)

26-28: Consider restructuring the test commands for better readability?

The test commands are accurate but could be more readable. What do you think about grouping them with bullet points and adding brief descriptions for each command? Something like:

- `poetry run pytest` to run all unit tests.
- `poetry run pytest -k <suite or test name>` to run specific unit tests.
- `poetry run pytest-fast` to run the subset of PyTest tests which are not flagged as `slow`. (Should take <5 min for fast tests only.)
+ - Run all tests: `poetry run pytest`
+ - Run specific tests: `poetry run pytest -k <suite or test name>`
+ - Run fast tests only (~5 min): `poetry run pytest-fast`

wdyt? 🤔

🧰 Tools
🪛 LanguageTool

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


58-61: Consider standardizing the bullet points style?

The unordered list uses asterisks (*) while other lists in the document use dashes (-). Would you like to standardize this for consistency?

-* The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
-* The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
-* The CDK development project is checked out in the `aibyte/airbyte-python-cdk` subdirectory of the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
+- The CDK development project is checked out in the `aibyte/airbyte-python-cdk` subdirectory of the root [Airbyte project](https://github.com/airbytehq/airbyte).
🧰 Tools
🪛 Markdownlint (0.37.0)

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-79: Add language specifiers to code blocks?

The code blocks would be more readable with syntax highlighting. Should we add language specifiers? For example:

-```
+```bash
poetry run <connector name> spec
poetry run <connector name> check --config secrets/config.json
...

Also applies to: 81-86

🧰 Tools
🪛 Markdownlint (0.37.0)

74-74: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


187-187: Consider adding a note about Docker prerequisites?

The section about running connector acceptance tests assumes Docker knowledge. Would it be helpful to add a note about Docker requirements or link to Docker setup instructions? wdyt? 🤔

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a33f815 and 4bc243d.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~35-~35: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-65: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-66: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


72-72: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-73: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


87-87: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


95-95: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


102-102: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-74: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


81-81: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/CONTRIBUTING.md (2)

17-18: LGTM! Clear installation instructions.

The setup instructions are clear and concise, properly directing users to install Poetry and its dependencies.


97-100: Fix the command in the example

The command example shows build but the context suggests it should be test:

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

Trying to collect and organize the knowledge on running the test locally.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
docs/CONTRIBUTING.md (6)

26-28: Consider improving command descriptions for clarity

The commands look good, but we could make them more descriptive. What do you think about adding brief descriptions of what each command does? For example:

-`poetry run pytest` to run all unit tests.
-`poetry run pytest -k <suite or test name>` to run specific unit tests.
-`poetry run pytest-fast` to run the subset of PyTest tests which are not flagged as `slow`.
+`poetry run pytest` - Executes the complete test suite
+`poetry run pytest -k <suite or test name>` - Runs tests matching the specified pattern
+`poetry run pytest-fast` - Executes only fast tests (not marked as `slow`, completes in <5 min)
🧰 Tools
🪛 LanguageTool

[style] ~28-2828: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


58-82: Consider restructuring the installation instructions for better flow

The installation steps could be more streamlined. What do you think about this structure:

  1. Move the assumptions section after the preparation steps
  2. Combine the path example with the directory change instruction
  3. Add a note about why we're editing pyproject.toml

Also, the code block at line 68-71 seems out of place - should it be moved to a different section? wdyt?

🧰 Tools
🪛 Markdownlint (0.37.0)

59-5959: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-6060: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-6161: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


64-6464: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-6565: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-6666: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-7373: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


75-7575: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


76-7676: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


82-8282: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-6363: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Line range hint 213-224: Consolidate duplicate test instructions

This section appears to duplicate the information from lines 115-134. Should we combine these sections to avoid confusion? We could keep one comprehensive section about running tests with the local CDK. wdyt?


59-76: Fix markdown list formatting

The unordered lists are using mixed styles (asterisks and dashes). For consistency with the rest of the document, should we standardize on dashes? Example:

-* The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
-* The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
+- The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
🧰 Tools
🪛 Markdownlint (0.37.0)

59-5959: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-6060: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-6161: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


64-6464: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-6565: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-6666: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-7373: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


75-7575: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


76-7676: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-6363: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


68-68: Add language specifiers to code blocks

The fenced code blocks are missing language specifiers. Should we add them for better syntax highlighting? For example:

-```
+```bash

Also applies to: 87-87, 97-97, 107-107

🧰 Tools
🪛 Markdownlint (0.37.0)

68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


140-142: Fix typo in Poe tasks description

There's a small typo in the Poe tasks section:

-The Airbyte CDK uses [Poe the Poet](https://poethepoet.natn.io/) to define common development task.
+The Airbyte CDK uses [Poe the Poet](https://poethepoet.natn.io/) to define common development tasks.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4bc243d and 62cca5e.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~28-2828: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~35-3535: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~36-3636: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

59-5959: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-6060: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-6161: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


64-6464: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-6565: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-6666: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-7373: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


75-7575: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


76-7676: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


82-8282: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


86-8686: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


106-106106: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


113-113113: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


121-121121: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


128-128128: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-6363: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


87-8787: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


97-9797: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


107-107107: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
docs/CONTRIBUTING.md (1)

124-126: Fix the command in the example

There seems to be a copy-paste error in the command example. The text mentions running tests, but the command shows build:

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

* to organize and present the setup steps explicitly
* to avoid misunderstandings and wasted eforts
@rpopov
Copy link
Author

rpopov commented Jan 5, 2025

The reported failed checks are due to #197

@rpopov rpopov changed the title Feat: Publish the response to process in the context of the record's transformation and filtering steps feat: Publish the response to process in the context of the record's transformation and filtering steps Jan 7, 2025
@marcosmarxm
Copy link
Member

marcosmarxm commented Jan 8, 2025

@airbytehq/dev-python Can someone take a look at this contribution?

renamed verify_* function to asser_* what it actually does.
Commended
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)

196-200: Would you consider adding a quick comment about service keys? 📝

The service key validation and exclusion logic looks good! To help future maintainers, maybe we could add a brief comment explaining why we're doing these steps? Something like:

    actual_records = list(extractor.extract_records(response))

+   # Validate and remove service keys (fields starting with '$') that are only used
+   # during the transformation phase and shouldn't be part of the final output
    for record in actual_records:
        assert_service_keys_exist(record)

    actual_records = [exclude_service_keys(record) for record in actual_records]
unit_tests/sources/declarative/extractors/test_record_selector.py (1)

170-170: Clean handling of service keys! ✨

Good addition of exclude_service_keys before assertions. Would it be valuable to add a test case that explicitly verifies that service keys are properly excluded? This could help catch any regressions in the service key handling. wdyt?

Here's a potential test case to consider adding:

(
    "test_service_keys_are_excluded",
    ["data"],
    None,
    {
        "data": [
            {"id": 1, "$service_field": "hidden", "visible_field": "shown"}
        ]
    },
    [{"id": 1, "visible_field": "shown"}]
),

Also applies to: 244-244

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62d2417 and 8c636f7.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py
  • unit_tests/sources/declarative/extractors/test_record_extractor.py
🧰 Additional context used
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/extractors/test_dpath_extractor.py

[error] 4-23: Import block is un-sorted or un-formatted

unit_tests/sources/declarative/extractors/test_record_selector.py

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (5)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (2)

40-94: Awesome test coverage! 🎯

Really nice job on covering various edge cases and nested structures! The test cases are well-organized and comprehensive.


156-179: Love the descriptive test IDs! 👌

The test identifiers are very clear and help understand the purpose of each test case at a glance.

unit_tests/sources/declarative/extractors/test_record_selector.py (3)

50-79: Great test coverage for nested structures! 🎯

The new test cases thoroughly verify the handling of nested data structures. Have you considered adding a test case where the nested structure is invalid or malformed to verify error handling? wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.


114-126: Nice test for root response access! 💡

Good test case demonstrating access to the original response. Would it be helpful to add a test case where we try to access an invalid path in the root response to verify error handling? wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.


12-19: Run formatter to fix indentation.

The pipeline indicates formatting issues. Would you mind running ruff format to fix the indentation? 🛠️

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.

rpopov and others added 3 commits January 9, 2025 01:44
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* main:
  fix(airbyte-cdk): unable to create custom retriever (airbytehq#198)
  feat(low-code): added keys replace transformation (airbytehq#183)
  feat: add `min` macros (airbytehq#203)
  fix(low-code cdk pagination): Fix the offset strategy so that it resets back to 0 when a stream is an incremental data feed (airbytehq#202)
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (10)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (2)

40-94: Add docstrings to explain test scenarios?

The new test cases are comprehensive and well-structured! Would you consider adding docstrings to explain the purpose of each test scenario? This could help other developers understand the edge cases being tested, wdyt?

For example:

@pytest.mark.parametrize(
    "field_path, decoder, body, expected_records",
    [
        # Test empty input handling
        pytest.param([], decoder_json, b"", [], id="test_extract_from_empty_string"),
        # Test nested structure handling
        pytest.param([], decoder_json, [{"id": 1, "nested": {"id2": 2}}], [{"id": 1, "nested": {"id2": 2}}], id="test_extract_from_nonempty_array_with_nested_array"),
        # ... more test cases
    ]
)

Also applies to: 156-179


196-200: Add explicit test cases for service key handling?

The service key verification and exclusion steps are good additions! Would it be helpful to add test cases that explicitly verify:

  1. The presence of required service keys after extraction
  2. The successful removal of service keys after exclusion

Here's a possible test case structure:

def test_service_key_handling():
    extractor = DpathExtractor(field_path=[], config=config, decoder=decoder_json, parameters=parameters)
    response = create_response({"id": 1})
    records = list(extractor.extract_records(response))
    
    # Verify service keys exist
    assert "$root" in records[0]
    
    # Verify service keys are removed
    cleaned_record = exclude_service_keys(records[0])
    assert "$root" not in cleaned_record
docs/CONTRIBUTING.md (5)

71-72: Consider consolidating duplicate references to Issue #197

Hey there! 👋 I noticed we're referencing the same issue twice in close proximity. Maybe we could combine these into a single reference under the "See also" section? wdyt?

 See also:
 
 - [Dager-Podman Integration](https://blog.playgroundtech.io/introduction-to-dagger-6ab55ee28723)
 - [CDK Issue 197](https://github.com/airbytehq/airbyte-python-cdk/issues/197)
-
-7. Make sure Docker is installed locally, instead of Podman
-   See also:
-
-- [CDK Issue 197](https://github.com/airbytehq/airbyte-python-cdk/issues/197)
+7. Make sure Docker is installed locally, instead of Podman

Also applies to: 77-77


21-23: Add language specifiers to code blocks

Hey! 🚀 Would you mind adding language specifiers to the code blocks? This helps with syntax highlighting and makes the documentation more readable. For example:

-```
+```bash
sudo dnf install python3.11



Also applies to: 29-31, 37-39, 43-45, 49-51, 55-57, 61-67, 81-83, 140-143, 154-157, 189-197, 213-229

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint (0.37.0)</summary>

21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

---

`163-163`: **Consider adding a comma for better readability**

What do you think about adding a comma after "page" in this sentence? 
```diff
-- On the **Settings / Domains** page find the subdomain
++ On the **Settings / Domains** page, find the subdomain
🧰 Tools
🪛 LanguageTool

[uncategorized] ~163-~163: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


283-283: Consider using American English for consistency

Hey! 👋 The phrase "have a look" is British English. Since the rest of the documentation uses American English, would you consider changing it to "take a look"? wdyt?

-Please have a look at the [Release Management]
+Please take a look at the [Release Management]
🧰 Tools
🪛 LanguageTool

[locale-violation] ~283-~283: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)


136-136: Consider using proper heading syntax

Instead of using bold text for these section headers, would you consider using proper markdown heading syntax? This would improve the document's structure and make it easier to navigate. For example:

-**Preparation steps**
+### Preparation steps

Also applies to: 235-235

🧰 Tools
🪛 Markdownlint (0.37.0)

136-136: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

unit_tests/sources/declarative/extractors/test_record_selector.py (3)

50-79: Great test coverage for nested structures! 🎯

The new test cases thoroughly cover nested data scenarios. What do you think about making the test names more Pythonic by using underscores instead of camelCase? For example:

  • test_no_record_filter_returns_all_records_with_nested (already good!)
  • test_true_record_filter_returns_all_records_with_nested (already good!)

114-126: Consider renaming this test case for consistency

What do you think about renaming test_the original response is available in filters and transformations to test_original_response_available_in_filters_and_transformations to match the Python naming convention used in other test cases? wdyt?


170-170: Consider adding explicit service key tests

I see you've added service key exclusion in the assertions. What do you think about adding specific test cases to verify the behavior of service keys? This could include:

  • Verifying that service keys are properly excluded
  • Checking edge cases with missing or malformed service keys

Example test case structure:

@pytest.mark.parametrize(
    "test_name, input_record, expected_output",
    [
        (
            "test_exclude_service_keys",
            {"id": 1, "$root": {"data": []}, "$parameters": {}},
            {"id": 1}
        ),
        # Add more test cases
    ]
)
def test_service_key_handling(test_name, input_record, expected_output):
    assert exclude_service_keys(input_record) == expected_output

Also applies to: 244-244

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c636f7 and 118d785.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • docs/CONTRIBUTING.md (5 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~96-~96: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~103-~103: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~104-~104: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~163-~163: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~265-~265: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~283-~283: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

136-136: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


235-235: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


55-55: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


61-61: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


81-81: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


140-140: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


154-154: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


170-170: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


189-189: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


201-201: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


213-213: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


221-221: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)

18-21: LGTM! Clean import organization.

The new imports for service key management functions are well-organized and properly placed.

docs/CONTRIBUTING.md (1)

245-246: The command example needs updating

The text mentions testing, but the command shows build. This issue was previously identified in a past review.

unit_tests/sources/declarative/extractors/test_record_selector.py (1)

12-19: LGTM! Clean import organization.

The new imports are well-organized and properly scoped for the added functionality.

Trigger another another build.

The unit tests in this exactly version succeed under Fedora 41, while
one of them fails under Oracle Linux 8.7.
The integration and acceptance tests fail under specific OS and setup,
not related to the contents of this PR.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
docs/CONTRIBUTING.md (3)

19-56: Would you consider improving the formatting consistency of installation instructions?

The Fedora-specific installation steps could be better organized. Here's a suggestion, wdyt?

  1. Add language specifiers to code blocks (e.g., bash or shell)
  2. Consider using a consistent pattern for OS-specific instructions, like:
2. Make sure your Python version is 3.10 or 3.11

Installation commands:
- Generic:
```bash
# Your generic installation commands
  • Fedora 41:
sudo dnf install python3.11

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint (0.37.0)</summary>

21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

55-55: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

---

`99-102`: **Would you consider reorganizing the test commands for better clarity?**

The test commands could be organized by purpose. How about something like this, wdyt?

```markdown
### Run Unit Tests

Run all tests:
```bash
poetry run pytest

Run specific tests:

poetry run pytest -k <suite or test name>

Run fast tests only (<5 min):

poetry run pytest-fast

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary>

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - `poetry run pytest-fast` to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

</details>

</details>

---

`175-184`: **Consider adding a note about sensitive information in the example**

The example configuration contains placeholders for sensitive information. Would you consider adding a note to remind users not to commit their actual credentials, wdyt?

```diff
 {
   "shop": "nw0ipt-vr",
   "start_date": "2020-11-01",
   "credentials": {
     "auth_method": "api_password",
     "api_password": "shpat_XXXXXXXXXXX"   # Replace with your actual API password
   }
 }
+
+Note: Never commit your actual credentials to version control. Make sure config.json is in .gitignore.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 118d785 and b49a98d.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~109-~109: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~270-~270: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~288-~288: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

136-136: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


235-235: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


55-55: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


61-61: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


81-81: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


140-140: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


154-154: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


170-170: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


189-189: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


201-201: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


213-213: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


221-221: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/CONTRIBUTING.md (2)

250-251: The command example needs to be fixed

The text mentions running tests, but the command shows build. This issue was previously identified.

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

Line range hint 270-294: Documentation generation instructions look good!

The instructions for generating API documentation and managing releases are clear and well-organized.

🧰 Tools
🪛 LanguageTool

[locale-violation] ~288-~288: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

Trigger another build, hoping the infrastructure will not fail it
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/CONTRIBUTING.md (2)

21-23: Would you consider adding language specifiers to code blocks? 🤔

Adding language specifiers (e.g., bash, json) would enable syntax highlighting and improve readability. For example:

-```
+```bash
sudo dnf install python3.11

Also applies to: 29-31, 37-39, 43-45, 49-51, 54-56, 60-62, 66-72, 86-88, 144-147, 158-161, 174-183, 193-201, 205-213, 217-221, 225-233

🧰 Tools
🪛 Markdownlint (0.37.0)

21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


287-287: Small style suggestions to enhance readability 📝

Would you consider these minor improvements?

  • Line 287: In American English, "take a look" is more commonly used than "have a look"
  • Lines 101, 108: Consider varying the sentence beginnings to avoid repetition

Just nitpicks to make the docs even better, wdyt? 😊

Also applies to: 101-101, 108-108

🧰 Tools
🪛 LanguageTool

[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b49a98d and f7f19be.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~269-~269: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

140-140: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


239-239: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


54-54: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


60-60: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


66-66: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


86-86: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


144-144: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


158-158: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


174-174: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


193-193: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


205-205: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


217-217: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


225-225: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
docs/CONTRIBUTING.md (2)

8-11: Great job on improving the documentation structure! 🎉

The installation instructions are now much clearer with OS-specific commands and detailed steps. This will help contributors get started more easily.

Also applies to: 17-23, 25-56


249-250: The command example needs to be updated.

The text mentions running tests, but the command shows build.

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

Handled nitpick comments by rabbit
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/CONTRIBUTING.md (4)

8-8: Enhance clarity of contribution types

The line mentions "bug fixes" but doesn't specify the scope. Would it be helpful to clarify the types of bug fixes (e.g., runtime bugs, documentation fixes, test fixes)? wdyt?

-You can contribute by reporting bugs, posting GitHub discussions, opening issues, improving docs, and submitting pull requests with bug fixes and new features.
+You can contribute by reporting bugs, posting GitHub discussions, opening issues, improving docs, and submitting pull requests with bug fixes (runtime, documentation, or test fixes) and new features.

19-23: Consider adding version verification command

Would it be helpful to add a command to verify the Python version after installation? wdyt?

sudo dnf install python3.11
+
+# Verify installation
+python3.11 --version

99-102: Consider restructuring test commands for better readability

The test commands could be more readable with additional context. What do you think about adding descriptions and examples? wdyt?

-poetry run pytest` to run all unit tests.
-poetry run pytest -k <suite or test name>` to run specific unit tests.
-poetry run pytest-fast` to run the subset of PyTest tests, which are not flagged as `slow`. (It should take <5 min for fast tests only.)
+# Run all unit tests
+poetry run pytest
+
+# Run specific test suite or test (e.g., 'test_record_extractor')
+poetry run pytest -k <suite or test name>
+
+# Run fast tests only (takes <5 min)
+poetry run pytest-fast
🧰 Tools
🪛 LanguageTool

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


167-171: Fix formatting in the configuration example

The configuration example has inconsistent formatting. Consider using a code block for better readability:

-    domain: nw0ipt-vr.myshopify.com
-    shop: nw0ipt-vr
+```yaml
+domain: nw0ipt-vr.myshopify.com
+shop: nw0ipt-vr
+```
🧰 Tools
🪛 LanguageTool

[uncategorized] ~167-~167: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7f19be and e5b9e5c.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~137-~137: You might be missing the article “the” here.
Context: ...om/airbytehq/airbyte) is checked out in airbyte directory. - The [CDK developme...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~167-~167: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~269-~269: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

140-140: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


239-239: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


86-86: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (1)
docs/CONTRIBUTING.md (1)

249-250: Fix the command in the example

There seems to be a copy-paste error in the command example. The text mentions running tests, but the command shows build:

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

docs/CONTRIBUTING.md Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
unit_tests/sources/declarative/test_manifest_declarative_source.py (2)

1771-1771: Consider adding a descriptive assertion message?

The assertion could be more helpful during test failures if we add a descriptive message. What do you think about:

-        assert output_data == expected_records
+        assert output_data == expected_records, f"Expected {expected_records} but got {output_data}"

Line range hint 1772-2051: Great test coverage for caching behavior! 🎉

The test thoroughly verifies the caching behavior between parent and child streams. The assertions effectively check both the stream creation and caching configuration.

Consider extracting common configurations to fixtures?

The test setup contains repeated stream configurations. What do you think about extracting common parts into fixtures to improve maintainability? For example:

@pytest.fixture
def base_stream_config():
    return {
        "paginator": {
            "type": "DefaultPaginator",
            "page_size": 10,
            "page_size_option": {
                "type": "RequestOption",
                "inject_into": "request_parameter",
                "field_name": "per_page",
            },
            # ... other common config
        }
    }
airbyte_cdk/sources/declarative/extractors/record_selector.py (1)

162-167: Consider adding a docstring for clarity.

The method looks good, but would you consider adding a docstring to explain what service keys are and why we're removing them? Something like:

def _remove_service_keys(
    self, records: Iterable[Mapping[str, Any]]
) -> Iterable[Mapping[str, Any]]:
    """
    Remove service keys (fields starting with '$') from each record.
    These keys are used during transformation but should not be part of the final output.

    Args:
        records: Iterator of records potentially containing service keys

    Returns:
        Iterator of records with service keys removed
    """

wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5b9e5c and 638ed41.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/extractors/record_selector.py (3 hunks)
  • airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1 hunks)
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/extractors/record_selector.py (2)

11-14: LGTM! Clean import addition.

The import of exclude_service_keys is well-placed alongside related imports from the same module.


112-113: Verify the sequence of operations.

The sequence looks good: filter -> transform -> remove service keys -> normalize. However, would it be more efficient to remove service keys during the transformation phase instead of having a separate pass through the data? wdyt?

Let's verify if any transformations depend on service keys:

✅ Verification successful

The separate service key removal step is necessary

The current sequence is correct. Service keys are actively used during transformations (e.g., in FlattenFields), so removing them earlier would break existing functionality. The separate pass keeps the code clean and maintainable by following single responsibility principle.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any transformations that might use service keys (fields starting with $)
rg -A 5 'transform.*\$' ./airbyte_cdk/sources/declarative/transformations/

Length of output: 76


Script:

#!/bin/bash
# Find all transformation implementations
echo "=== Transformation Implementations ==="
ast-grep --pattern 'class $_ implements TransformationInterface'

echo -e "\n=== Service Key Patterns ==="
# Look for service key patterns more broadly
rg -g '*.py' -A 5 'service.*key|SERVICE.*KEY' ./airbyte_cdk/sources/declarative/

Length of output: 7629

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
airbyte_cdk/sources/declarative/decoders/decoder.py (1)

29-30: Great docstring improvement! Would you consider adding more details? 🤔

Thanks for clarifying the empty response behavior! This is super helpful. Since we're improving the docs, would you consider also adding examples of what constitutes an "empty response" (e.g., null, empty string, etc.) and maybe mention any special handling for malformed responses? wdyt?

 Decodes a requests.Response into a Mapping[str, Any] or an array.
-The empty response produces an empty list.
+The empty response (e.g., null, empty string, no content) produces an empty list.
+Malformed responses will raise a DecodeError with details about the failure.
 :param response: the response to decode
 :return: Generator of Mapping describing the response
unit_tests/sources/declarative/decoders/test_json_decoder.py (2)

11-18: Hey there! 👋 I noticed some unused imports.

The following imports aren't being used in this file:

  • YamlDeclarativeSource
  • DeclarativeStreamModel
  • ModelToComponentFactory

Would you like to remove them to keep the imports clean? wdyt? 🤔

-from airbyte_cdk import YamlDeclarativeSource
from airbyte_cdk.models import SyncMode
from airbyte_cdk.sources.declarative.decoders import GzipJsonDecoder
from airbyte_cdk.sources.declarative.decoders.json_decoder import JsonDecoder, JsonlDecoder
-from airbyte_cdk.sources.declarative.models import DeclarativeStream as DeclarativeStreamModel
-from airbyte_cdk.sources.declarative.parsers.model_to_component_factory import (
-    ModelToComponentFactory,
-)
🧰 Tools
🪛 GitHub Actions: Linters

[warning] File needs formatting. Run 'ruff format' to fix formatting issues.


Line range hint 1-150: Quick formatting note 🎨

The linter reports some formatting issues. Could you run ruff format to fix them? This will ensure consistent formatting across the codebase.

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File needs formatting. Run 'ruff format' to fix formatting issues.

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (3)

40-42: Consider adding error scenario tests?

The empty input test cases look good! Would you consider adding some error scenarios too? For example:

  • Malformed JSON input
  • Invalid field paths
  • Unexpected data types

What do you think? 🤔

Also applies to: 126-129


40-40: Could we clarify the JSONL contract comment?

The comment # The JSON contract is irregular, compare with JSONL might be a bit cryptic for future readers. Would you mind expanding on what makes it irregular and why it's important to compare with JSONL? Maybe something like:

# JSON decoder returns [{}] for empty input while JSONL returns [], 
# this difference is by design due to their distinct parsing contracts

wdyt? 🤓


196-201: Enhance empty record handling?

The service key handling looks solid! A couple of small suggestions to make it even better:

  1. The comment # A valid JSON parsed, see the contract could be more descriptive. Maybe:
# Only validate service keys for non-empty records as per the extractor contract
  1. Would it make sense to define an EMPTY_RECORD = {} constant at the module level? This would make the empty record check more maintainable:
if record != EMPTY_RECORD:

What are your thoughts on these suggestions? 🎯

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 638ed41 and ca9026a.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/decoders/decoder.py (1 hunks)
  • airbyte_cdk/sources/declarative/decoders/json_decoder.py (1 hunks)
  • airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2 hunks)
  • unit_tests/sources/declarative/decoders/test_json_decoder.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/extractors/dpath_extractor.py
  • airbyte_cdk/sources/declarative/decoders/json_decoder.py
🧰 Additional context used
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/decoders/test_json_decoder.py

[warning] File needs formatting. Run 'ruff format' to fix formatting issues.

⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
unit_tests/sources/declarative/decoders/test_json_decoder.py (2)

22-27: Could we add more context about the JSON contract? 🤔

I see you've added a comment about the JSON contract being irregular, but it might help future maintainers to understand:

  1. Why empty string returns [{}] instead of an empty list
  2. What makes the contract irregular
  3. Where this behavior is documented

Would you mind adding a more detailed docstring or comment explaining these points? wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File needs formatting. Run 'ruff format' to fix formatting issues.


48-54: Love the improved test case IDs! 🎉

The descriptive IDs make the test cases much more readable and maintainable. Nice job on adding edge cases for empty objects and lists too!

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File needs formatting. Run 'ruff format' to fix formatting issues.

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)

18-21: LGTM! Clean import organization.

The new imports for service key management functions are well-organized and properly placed.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
unit_tests/sources/declarative/test_manifest_declarative_source.py (3)

137-139: Would you like to enhance the docstring with more details?

The current docstring could be more helpful if it included:

  • Purpose of the dynamic declarative stream configuration
  • Parameters it accepts from _declarative_stream
  • Example usage in tests
    wdyt?
-        """
-        Generates a DynamicDeclarativeStream configuration.
-        """
+        """
+        Generates a DynamicDeclarativeStream configuration for testing.
+        
+        This fixture builds upon the base declarative stream configuration from _declarative_stream
+        and adds dynamic resolution capabilities through a components resolver.
+        
+        Returns:
+            dict: A configuration dictionary for a DynamicDeclarativeStream with template-based
+                 field resolution.
+        
+        Example:
+            def test_dynamic_stream(self, _dynamic_declarative_stream):
+                manifest = {"streams": [_dynamic_declarative_stream]}
+                # Test implementation
+        """

1773-1773: Fix indentation for better readability?

The line appears to be incorrectly indented. Should we align it with the rest of the function body?

+        # Assert the output matches expected records
+        assert output_data == expected_records

Line range hint 402-425: Consider adding type hints to helper functions?

Adding type hints to _create_request, _create_response, and _create_page would improve code maintainability and IDE support. wdyt?

-def _create_request():
+def _create_request() -> requests.PreparedRequest:

-def _create_response(body):
+def _create_response(body: dict) -> requests.Response:

-def _create_page(response_body):
+def _create_page(response_body: dict) -> requests.Response:
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04e12a3 and 8d0c705.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/test_manifest_declarative_source.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (1)
unit_tests/sources/declarative/test_manifest_declarative_source.py (1)

Line range hint 1-2000: Great test coverage! 👍

The test suite is well-structured with comprehensive coverage of:

  • Manifest validation
  • Pagination scenarios
  • Partition routing
  • Data transformations
  • Error cases

@marcosmarxm
Copy link
Member

@airbytehq/dev-python can someone take a look at this contribution?

@rpopov
Copy link
Author

rpopov commented Jan 13, 2025

The squashed commits are in a new PR: #214

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants