Skip to content

Commit cbe50e1

Browse files
authored
chore: Add changelog for 0.5.0 (#1278)
* Add changelog * revert accidental change * move 2 items to performance section
1 parent 9fe5420 commit cbe50e1

File tree

1 file changed

+129
-0
lines changed

1 file changed

+129
-0
lines changed

dev/changelog/0.5.0.md

+129
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
<!--
2+
Licensed to the Apache Software Foundation (ASF) under one
3+
or more contributor license agreements. See the NOTICE file
4+
distributed with this work for additional information
5+
regarding copyright ownership. The ASF licenses this file
6+
to you under the Apache License, Version 2.0 (the
7+
"License"); you may not use this file except in compliance
8+
with the License. You may obtain a copy of the License at
9+
10+
http://www.apache.org/licenses/LICENSE-2.0
11+
12+
Unless required by applicable law or agreed to in writing,
13+
software distributed under the License is distributed on an
14+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
15+
KIND, either express or implied. See the License for the
16+
specific language governing permissions and limitations
17+
under the License.
18+
-->
19+
20+
# DataFusion Comet 0.5.0 Changelog
21+
22+
This release consists of 69 commits from 15 contributors. See credits at the end of this changelog for more information.
23+
24+
**Fixed bugs:**
25+
26+
- fix: Unsigned type related bugs [#1095](https://github.com/apache/datafusion-comet/pull/1095) (kazuyukitanimura)
27+
- fix: Use RDD partition index [#1112](https://github.com/apache/datafusion-comet/pull/1112) (viirya)
28+
- fix: Various metrics bug fixes and improvements [#1111](https://github.com/apache/datafusion-comet/pull/1111) (andygrove)
29+
- fix: Don't create CometScanExec for subclasses of ParquetFileFormat [#1129](https://github.com/apache/datafusion-comet/pull/1129) (Kimahriman)
30+
- fix: Fix metrics regressions [#1132](https://github.com/apache/datafusion-comet/pull/1132) (andygrove)
31+
- fix: Enable scenarios accidentally commented out in CometExecBenchmark [#1151](https://github.com/apache/datafusion-comet/pull/1151) (mbutrovich)
32+
- fix: Spark 4.0-preview1 SPARK-47120 [#1156](https://github.com/apache/datafusion-comet/pull/1156) (kazuyukitanimura)
33+
- fix: Document enabling comet explain plan usage in Spark (4.0) [#1176](https://github.com/apache/datafusion-comet/pull/1176) (parthchandra)
34+
- fix: stddev_pop should not directly return 0.0 when count is 1.0 [#1184](https://github.com/apache/datafusion-comet/pull/1184) (viirya)
35+
- fix: fix missing explanation for then branch in case when [#1200](https://github.com/apache/datafusion-comet/pull/1200) (rluvaton)
36+
- fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [#1253](https://github.com/apache/datafusion-comet/pull/1253) (andygrove)
37+
- fix: Fall back to Spark for distinct aggregates [#1262](https://github.com/apache/datafusion-comet/pull/1262) (andygrove)
38+
- fix: disable initCap by default [#1276](https://github.com/apache/datafusion-comet/pull/1276) (kazuyukitanimura)
39+
40+
**Performance related:**
41+
42+
- perf: Stop passing Java config map into native createPlan [#1101](https://github.com/apache/datafusion-comet/pull/1101) (andygrove)
43+
- feat: Make native shuffle compression configurable and respect `spark.shuffle.compress` [#1185](https://github.com/apache/datafusion-comet/pull/1185) (andygrove)
44+
- perf: Improve query planning to more reliably fall back to columnar shuffle when native shuffle is not supported [#1209](https://github.com/apache/datafusion-comet/pull/1209) (andygrove)
45+
- feat: Move shuffle block decompression and decoding to native code and add LZ4 & Snappy support [#1192](https://github.com/apache/datafusion-comet/pull/1192) (andygrove)
46+
- feat: Implement custom RecordBatch serde for shuffle for improved performance [#1190](https://github.com/apache/datafusion-comet/pull/1190) (andygrove)
47+
48+
**Implemented enhancements:**
49+
50+
- feat: support array_insert [#1073](https://github.com/apache/datafusion-comet/pull/1073) (SemyonSinchenko)
51+
- feat: enable decimal to decimal cast of different precision and scale [#1086](https://github.com/apache/datafusion-comet/pull/1086) (himadripal)
52+
- feat: Improve ScanExec native metrics [#1133](https://github.com/apache/datafusion-comet/pull/1133) (andygrove)
53+
- feat: Add Spark-compatible implementation of SchemaAdapterFactory [#1169](https://github.com/apache/datafusion-comet/pull/1169) (andygrove)
54+
- feat: Improve shuffle metrics (second attempt) [#1175](https://github.com/apache/datafusion-comet/pull/1175) (andygrove)
55+
- feat: Add a `spark.comet.exec.memoryPool` configuration for experimenting with various datafusion memory pool setups. [#1021](https://github.com/apache/datafusion-comet/pull/1021) (Kontinuation)
56+
- feat: Reenable tests for filtered SMJ anti join [#1211](https://github.com/apache/datafusion-comet/pull/1211) (comphead)
57+
- feat: add support for array_remove expression [#1179](https://github.com/apache/datafusion-comet/pull/1179) (jatin510)
58+
59+
**Documentation updates:**
60+
61+
- docs: Update documentation for 0.4.0 release [#1096](https://github.com/apache/datafusion-comet/pull/1096) (andygrove)
62+
- docs: Fix readme typo FGPA -> FPGA [#1117](https://github.com/apache/datafusion-comet/pull/1117) (gstvg)
63+
- docs: Add more technical detail and new diagram to Comet plugin overview [#1119](https://github.com/apache/datafusion-comet/pull/1119) (andygrove)
64+
- docs: Add some documentation explaining how shuffle works [#1148](https://github.com/apache/datafusion-comet/pull/1148) (andygrove)
65+
- docs: Update TPC-H benchmark results [#1257](https://github.com/apache/datafusion-comet/pull/1257) (andygrove)
66+
67+
**Other:**
68+
69+
- chore: Add changelog for 0.4.0 [#1089](https://github.com/apache/datafusion-comet/pull/1089) (andygrove)
70+
- chore: Prepare for 0.5.0 development [#1090](https://github.com/apache/datafusion-comet/pull/1090) (andygrove)
71+
- build: Skip installation of spark-integration and fuzz testing modules [#1091](https://github.com/apache/datafusion-comet/pull/1091) (parthchandra)
72+
- minor: Add hint for finding the GPG key to use when publishing to maven [#1093](https://github.com/apache/datafusion-comet/pull/1093) (andygrove)
73+
- chore: Include first ScanExec batch in metrics [#1105](https://github.com/apache/datafusion-comet/pull/1105) (andygrove)
74+
- chore: Improve CometScan metrics [#1100](https://github.com/apache/datafusion-comet/pull/1100) (andygrove)
75+
- chore: Add custom metric for native shuffle fetching batches from JVM [#1108](https://github.com/apache/datafusion-comet/pull/1108) (andygrove)
76+
- chore: Remove unused StringView struct [#1143](https://github.com/apache/datafusion-comet/pull/1143) (andygrove)
77+
- test: enable more Spark 4.0 tests [#1145](https://github.com/apache/datafusion-comet/pull/1145) (kazuyukitanimura)
78+
- chore: Refactor cast to use SparkCastOptions param [#1146](https://github.com/apache/datafusion-comet/pull/1146) (andygrove)
79+
- chore: Move more expressions from core crate to spark-expr crate [#1152](https://github.com/apache/datafusion-comet/pull/1152) (andygrove)
80+
- chore: Remove dead code [#1155](https://github.com/apache/datafusion-comet/pull/1155) (andygrove)
81+
- chore: Move string kernels and expressions to spark-expr crate [#1164](https://github.com/apache/datafusion-comet/pull/1164) (andygrove)
82+
- chore: Move remaining expressions to spark-expr crate + some minor refactoring [#1165](https://github.com/apache/datafusion-comet/pull/1165) (andygrove)
83+
- chore: Add ignored tests for reading complex types from Parquet [#1167](https://github.com/apache/datafusion-comet/pull/1167) (andygrove)
84+
- test: enabling Spark tests with offHeap requirement [#1177](https://github.com/apache/datafusion-comet/pull/1177) (kazuyukitanimura)
85+
- minor: move shuffle classes from common to spark [#1193](https://github.com/apache/datafusion-comet/pull/1193) (andygrove)
86+
- minor: refactor to move decodeBatches to broadcast exchange code as private function [#1195](https://github.com/apache/datafusion-comet/pull/1195) (andygrove)
87+
- minor: refactor prepare_output so that it does not require an ExecutionContext [#1194](https://github.com/apache/datafusion-comet/pull/1194) (andygrove)
88+
- minor: remove unused source files [#1202](https://github.com/apache/datafusion-comet/pull/1202) (andygrove)
89+
- chore: Upgrade to DataFusion 44.0.0-rc2 [#1154](https://github.com/apache/datafusion-comet/pull/1154) (andygrove)
90+
- chore: Add safety check to CometBuffer [#1050](https://github.com/apache/datafusion-comet/pull/1050) (viirya)
91+
- chore: Remove unreachable code [#1213](https://github.com/apache/datafusion-comet/pull/1213) (andygrove)
92+
- test: Enable Comet by default except some tests in SparkSessionExtensionSuite [#1201](https://github.com/apache/datafusion-comet/pull/1201) (kazuyukitanimura)
93+
- chore: extract `struct` expressions to folders based on spark grouping [#1216](https://github.com/apache/datafusion-comet/pull/1216) (rluvaton)
94+
- chore: extract static invoke expressions to folders based on spark grouping [#1217](https://github.com/apache/datafusion-comet/pull/1217) (rluvaton)
95+
- chore: Follow-on PR to fully enable onheap memory usage [#1210](https://github.com/apache/datafusion-comet/pull/1210) (andygrove)
96+
- chore: extract agg_funcs expressions to folders based on spark grouping [#1224](https://github.com/apache/datafusion-comet/pull/1224) (rluvaton)
97+
- chore: extract datetime_funcs expressions to folders based on spark grouping [#1222](https://github.com/apache/datafusion-comet/pull/1222) (rluvaton)
98+
- chore: Upgrade to DataFusion 44.0.0 from 44.0.0 RC2 [#1232](https://github.com/apache/datafusion-comet/pull/1232) (rluvaton)
99+
- chore: extract strings file to `strings_func` like in spark grouping [#1215](https://github.com/apache/datafusion-comet/pull/1215) (rluvaton)
100+
- chore: extract predicate_functions expressions to folders based on spark grouping [#1218](https://github.com/apache/datafusion-comet/pull/1218) (rluvaton)
101+
- build(deps): bump protobuf version to 3.21.12 [#1234](https://github.com/apache/datafusion-comet/pull/1234) (wForget)
102+
- chore: extract json_funcs expressions to folders based on spark grouping [#1220](https://github.com/apache/datafusion-comet/pull/1220) (rluvaton)
103+
- test: Enable shuffle by default in Spark tests [#1240](https://github.com/apache/datafusion-comet/pull/1240) (kazuyukitanimura)
104+
- chore: extract hash_funcs expressions to folders based on spark grouping [#1221](https://github.com/apache/datafusion-comet/pull/1221) (rluvaton)
105+
- build: Fix test failure caused by merging conflicting PRs [#1259](https://github.com/apache/datafusion-comet/pull/1259) (andygrove)
106+
107+
## Credits
108+
109+
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
110+
111+
```
112+
37 Andy Grove
113+
10 Raz Luvaton
114+
7 KAZUYUKI TANIMURA
115+
3 Liang-Chi Hsieh
116+
2 Parth Chandra
117+
1 Adam Binford
118+
1 Dharan Aditya
119+
1 Himadri Pal
120+
1 Jagdish Parihar
121+
1 Kristin Cowalcijk
122+
1 Matt Butrovich
123+
1 Oleks V
124+
1 Sem
125+
1 Zhen Wang
126+
1 gstvg
127+
```
128+
129+
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.

0 commit comments

Comments
 (0)