diff --git a/docs/releases/1_20_2.md b/docs/releases/1_20_2.md
new file mode 100644
index 0000000000..bc29744cdc
--- /dev/null
+++ b/docs/releases/1_20_2.md
@@ -0,0 +1,50 @@
+---
+title: 1.20.2
+sidebar_position: 9937
+---
+
+# 1.20.2 - 2024-08-21
+
+### Added
+* **Python: add `CompositeTransport`** [`#2925`](https://github.com/OpenLineage/OpenLineage/pull/2925) [@JDarDagran](https://github.com/JDarDagran)
+    *Adds a `CompositeTransport` that can accept other transport configs to instantiate transports and use them to emit events (see the config sketch after this diff).*
+* **Spark: compile & test Spark integration on Java 17** [`#2828`](https://github.com/OpenLineage/OpenLineage/pull/2828) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
+    *The Spark integration is now always compiled with Java 17, while tests run on both Java 8 and Java 17 according to the configuration.*
+* **Spark: support preview release of Spark 4.0** [`#2854`](https://github.com/OpenLineage/OpenLineage/pull/2854) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
+    *Includes the Spark 4.0 preview release in the integration tests.*
+* **Spark: add handling for `Window`** [`#2901`](https://github.com/OpenLineage/OpenLineage/pull/2901) [@tnazarew](https://github.com/tnazarew)
+    *Adds handling for `Window`-type nodes of a logical plan.*
+* **Spark: extract and send events with raw SQL from Spark** [`#2913`](https://github.com/OpenLineage/OpenLineage/pull/2913) [@Imbruced](https://github.com/Imbruced)
+    *Adds a parser that traverses `QueryExecution` with a BFS algorithm to extract the raw SQL query from the SQL field.*
+* **Spark: support Mongostream source** [`#2887`](https://github.com/OpenLineage/OpenLineage/pull/2887) [@Imbruced](https://github.com/Imbruced)
+    *Adds a Mongo streaming visitor and tests.*
+* **Spark: new mechanism for disabling facets** [`#2912`](https://github.com/OpenLineage/OpenLineage/pull/2912) [@arturowczarek](https://github.com/arturowczarek)
+    *`FacetConfig` now accepts a disabled flag for any facet instead of taking the disabled facets as a list (see the conf sketch after this diff).*
+* **Spark: support Kinesis source** [`#2906`](https://github.com/OpenLineage/OpenLineage/pull/2906) [@Imbruced](https://github.com/Imbruced)
+    *Adds a Kinesis class handler in the streaming source builder.*
+* **Spark: extract `DatasetIdentifier` from extension `LineageNode`** [`#2900`](https://github.com/OpenLineage/OpenLineage/pull/2900) [@ddebowczyk92](https://github.com/ddebowczyk92)
+    *Adds support for cases in which `LogicalRelation` has a grandchild node that implements the `LineageRelation` interface.*
+* **Spark: extract Dataset from underlying `BaseRelation`** [`#2893`](https://github.com/OpenLineage/OpenLineage/pull/2893) [@ddebowczyk92](https://github.com/ddebowczyk92)
+    *`DatasetIdentifier` is now extracted from the underlying node of `LogicalRelation`.*
+* **Spark: add descriptions and Marquez UI to Docker Compose file** [`#2889`](https://github.com/OpenLineage/OpenLineage/pull/2889) [@jonathanlbt1](https://github.com/jonathanlbt1)
+    *Adds the `marquez-web` service to docker-compose.yml.*
+
+### Fixed
+* **Proxy: fix error message descriptions** [`#2880`](https://github.com/OpenLineage/OpenLineage/pull/2880) [@jonathanlbt1](https://github.com/jonathanlbt1)
+    *Improves error logging.*
+* **Proxy: update Docker image for Fluentd 1.17** [`#2877`](https://github.com/OpenLineage/OpenLineage/pull/2877) [@jonathanlbt1](https://github.com/jonathanlbt1)
+    *Upgrades the Fluentd version.*
+* **Spark: fix issue with Kafka source when saving with `foreach` batch method** [`#2868`](https://github.com/OpenLineage/OpenLineage/pull/2868) [@Imbruced](https://github.com/Imbruced)
+    *Fixes an issue in which the Kafka input was missing from the event when Spark ran in streaming mode.*
+* **Spark: properly set ARN in namespace for Iceberg Glue symlinks** [`#2943`](https://github.com/OpenLineage/OpenLineage/pull/2943) [@arturowczarek](https://github.com/arturowczarek)
+    *Makes `IcebergHandler` support Glue catalog tables and create the symlink using the code from `PathUtils`.*
+* **Spark: accept any provider for AWS Glue storage format** [`#2917`](https://github.com/OpenLineage/OpenLineage/pull/2917) [@arturowczarek](https://github.com/arturowczarek)
+    *Makes the AWS Glue ARN-generating method accept every format (including Parquet), not only Hive SerDe.*
+* **Spark: return valid JSON for failed logical plan serialization** [`#2892`](https://github.com/OpenLineage/OpenLineage/pull/2892) [@arturowczarek](https://github.com/arturowczarek)
+    *The `LogicalPlanSerializer` now returns `{}` for failed serialization instead of an empty string.*
+* **Spark: extract legacy column lineage visitors loader** [`#2883`](https://github.com/OpenLineage/OpenLineage/pull/2883) [@arturowczarek](https://github.com/arturowczarek)
+    *Refactors `CustomCollectorsUtils` for improved readability.*
+* **Spark: add Kafka input source when writing in `foreach` batch mode** [`#2868`](https://github.com/OpenLineage/OpenLineage/pull/2868) [@Imbruced](https://github.com/Imbruced)
+    *Fixes a bug that kept Kafka input sources from being produced.*
+* **Spark: extract `DatasetIdentifier` from `SaveIntoDataSourceCommandVisitor` options** [`#2934`](https://github.com/OpenLineage/OpenLineage/pull/2934) [@ddebowczyk92](https://github.com/ddebowczyk92)
+    *Extracts `DatasetIdentifier` from the command's options instead of relying on `p.createRelation(sqlContext, command.options())`, which is a heavy operation for `JdbcRelationProvider`.*
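
For context on the `CompositeTransport` entry (`#2925`): below is a minimal sketch of what a composite client config might look like in `openlineage.yml`, assuming the transport registers under a `composite` type that takes a list of nested transport configs. The endpoint URL and the choice of nested transports are illustrative, not taken from the release.

```yaml
# Hypothetical openlineage.yml sketch for a composite transport.
# The `composite` type name and the nested `transports` list follow the
# changelog description; the endpoint and transport choices are made up.
transport:
  type: composite
  transports:
    - type: http
      url: http://localhost:5000   # illustrative HTTP backend
    - type: console                # additionally print events to stdout
```

With a layout like this, every emitted event would be fanned out to each nested transport in turn, so one client can feed an HTTP backend and a local log at the same time.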
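Similarly, for the new facet-disabling mechanism (`#2912`): a sketch of what per-facet disabled flags might look like as Spark conf entries, replacing a single list-valued setting. The exact property keys and facet names here are assumptions based on the changelog description.

```sh
# Hypothetical spark-submit flags: one boolean `disabled` flag per facet
# instead of a single semicolon-separated list (property names assumed).
spark-submit \
  --conf spark.openlineage.facets.spark_unknown.disabled=true \
  --conf spark.openlineage.facets.spark.logicalPlan.disabled=true \
  my_job.py
```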