Commit 5daf96f

Merge branch 'master' into interleave-bloom
2 parents a417f01 + 087f34b commit 5daf96f

215 files changed: +16638 -8858 lines changed

.github/workflows/docs.yml

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ jobs:
  rm website/build/artifact.tar
  cp .asf.yaml ./website/build/.asf.yaml
  - name: Deploy to gh-pages
- uses: peaceiris/actions-gh-pages@v3.9.3
+ uses: peaceiris/actions-gh-pages@v4.0.0
  if: github.event_name == 'push' && github.ref_name == 'master'
  with:
  github_token: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/integration.yml

Lines changed: 13 additions & 1 deletion
@@ -57,15 +57,17 @@ jobs:
  env:
  ARROW_USE_CCACHE: OFF
  ARROW_CPP_EXE_PATH: /build/cpp/debug
+ ARROW_NANOARROW_PATH: /build/nanoarrow
  ARROW_RUST_EXE_PATH: /build/rust/debug
  BUILD_DOCS_CPP: OFF
  ARROW_INTEGRATION_CPP: ON
  ARROW_INTEGRATION_CSHARP: ON
  ARROW_INTEGRATION_GO: ON
  ARROW_INTEGRATION_JAVA: ON
  ARROW_INTEGRATION_JS: ON
+ ARCHERY_INTEGRATION_WITH_NANOARROW: "1"
  # https://github.com/apache/arrow/pull/38403/files#r1371281630
- ARCHERY_INTEGRATION_WITH_RUST: ON
+ ARCHERY_INTEGRATION_WITH_RUST: "1"
  # These are necessary because the github runner overrides $HOME
  # https://github.com/actions/runner/issues/863
  RUSTUP_HOME: /root/.rustup
@@ -95,6 +97,16 @@ jobs:
  with:
  path: rust
  fetch-depth: 0
+ - name: Checkout Arrow nanoarrow
+ uses: actions/checkout@v4
+ with:
+ repository: apache/arrow-nanoarrow
+ path: nanoarrow
+ fetch-depth: 0
+ # Workaround https://github.com/rust-lang/rust/issues/125067
+ - name: Downgrade rust
+ working-directory: rust
+ run: rustup override set 1.77
  - name: Build
  run: conda run --no-capture-output ci/scripts/integration_arrow_build.sh $PWD /build
  - name: Run

.github/workflows/object_store.yml

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ jobs:

  - name: Setup LocalStack (AWS emulation)
  run: |
- echo "LOCALSTACK_CONTAINER=$(docker run -d -p 4566:4566 localstack/localstack:3.2.0)" >> $GITHUB_ENV
+ echo "LOCALSTACK_CONTAINER=$(docker run -d -p 4566:4566 localstack/localstack:3.3.0)" >> $GITHUB_ENV
  echo "EC2_METADATA_CONTAINER=$(docker run -d -p 1338:1338 amazon/amazon-ec2-metadata-mock:v1.9.2 --imdsv2)" >> $GITHUB_ENV
  aws --endpoint-url=http://localhost:4566 s3 mb s3://test-bucket
  aws --endpoint-url=http://localhost:4566 dynamodb create-table --table-name test-table --key-schema AttributeName=path,KeyType=HASH AttributeName=etag,KeyType=RANGE --attribute-definitions AttributeName=path,AttributeType=S AttributeName=etag,AttributeType=S --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

CHANGELOG-old.md

Lines changed: 137 additions & 0 deletions
Large diffs are not rendered by default.

CHANGELOG.md

Lines changed: 163 additions & 118 deletions
Large diffs are not rendered by default.

Cargo.toml

Lines changed: 16 additions & 16 deletions
@@ -62,7 +62,7 @@ exclude = [
  ]

  [workspace.package]
- version = "51.0.0"
+ version = "52.0.0"
  homepage = "https://github.com/apache/arrow-rs"
  repository = "https://github.com/apache/arrow-rs"
  authors = ["Apache Arrow <dev@arrow.apache.org>"]
@@ -77,20 +77,20 @@ edition = "2021"
  rust-version = "1.62"

  [workspace.dependencies]
- arrow = { version = "51.0.0", path = "./arrow", default-features = false }
- arrow-arith = { version = "51.0.0", path = "./arrow-arith" }
- arrow-array = { version = "51.0.0", path = "./arrow-array" }
- arrow-buffer = { version = "51.0.0", path = "./arrow-buffer" }
- arrow-cast = { version = "51.0.0", path = "./arrow-cast" }
- arrow-csv = { version = "51.0.0", path = "./arrow-csv" }
- arrow-data = { version = "51.0.0", path = "./arrow-data" }
- arrow-ipc = { version = "51.0.0", path = "./arrow-ipc" }
- arrow-json = { version = "51.0.0", path = "./arrow-json" }
- arrow-ord = { version = "51.0.0", path = "./arrow-ord" }
- arrow-row = { version = "51.0.0", path = "./arrow-row" }
- arrow-schema = { version = "51.0.0", path = "./arrow-schema" }
- arrow-select = { version = "51.0.0", path = "./arrow-select" }
- arrow-string = { version = "51.0.0", path = "./arrow-string" }
- parquet = { version = "51.0.0", path = "./parquet", default-features = false }
+ arrow = { version = "52.0.0", path = "./arrow", default-features = false }
+ arrow-arith = { version = "52.0.0", path = "./arrow-arith" }
+ arrow-array = { version = "52.0.0", path = "./arrow-array" }
+ arrow-buffer = { version = "52.0.0", path = "./arrow-buffer" }
+ arrow-cast = { version = "52.0.0", path = "./arrow-cast" }
+ arrow-csv = { version = "52.0.0", path = "./arrow-csv" }
+ arrow-data = { version = "52.0.0", path = "./arrow-data" }
+ arrow-ipc = { version = "52.0.0", path = "./arrow-ipc" }
+ arrow-json = { version = "52.0.0", path = "./arrow-json" }
+ arrow-ord = { version = "52.0.0", path = "./arrow-ord" }
+ arrow-row = { version = "52.0.0", path = "./arrow-row" }
+ arrow-schema = { version = "52.0.0", path = "./arrow-schema" }
+ arrow-select = { version = "52.0.0", path = "./arrow-select" }
+ arrow-string = { version = "52.0.0", path = "./arrow-string" }
+ parquet = { version = "52.0.0", path = "./parquet", default-features = false }

  chrono = { version = "0.4.34", default-features = false, features = ["clock"] }

README.md

Lines changed: 73 additions & 19 deletions
@@ -17,41 +17,94 @@
  under the License.
  -->

- # Native Rust implementation of Apache Arrow and Parquet
+ # Native Rust implementation of Apache Arrow and Apache Parquet

  [![Coverage Status](https://codecov.io/gh/apache/arrow-rs/rust/branch/master/graph/badge.svg)](https://codecov.io/gh/apache/arrow-rs?branch=master)

- Welcome to the implementation of Arrow, the popular in-memory columnar format, in [Rust][rust].
+ Welcome to the [Rust][rust] implementation of [Apache Arrow], the popular in-memory columnar format.

  This repo contains the following main components:

- | Crate | Description | Latest API Docs | README |
- | ------------ | ------------------------------------------------------------------------- | ---------------------------------------------- | ------------------------------ |
- | arrow | Core functionality (memory layout, arrays, low level computations) | [docs.rs](https://docs.rs/arrow/latest) | [(README)][arrow-readme] |
- | parquet | Support for Parquet columnar file format | [docs.rs](https://docs.rs/parquet/latest) | [(README)][parquet-readme] |
- | arrow-flight | Support for Arrow-Flight IPC protocol | [docs.rs](https://docs.rs/arrow-flight/latest) | [(README)][flight-readme] |
- | object-store | Support for object store interactions (aws, azure, gcp, local, in-memory) | [docs.rs](https://docs.rs/object_store/latest) | [(README)][objectstore-readme] |
+ | Crate | Description | Latest API Docs | README |
+ | ------------------ | ---------------------------------------------------------------------------- | ------------------------------------------------ | --------------------------------- |
+ | [`arrow`] | Core functionality (memory layout, arrays, low level computations) | [docs.rs](https://docs.rs/arrow/latest) | [(README)][arrow-readme] |
+ | [`arrow-flight`] | Support for Arrow-Flight IPC protocol | [docs.rs](https://docs.rs/arrow-flight/latest) | [(README)][flight-readme] |
+ | [`object-store`] | Support for object store interactions (aws, azure, gcp, local, in-memory) | [docs.rs](https://docs.rs/object_store/latest) | [(README)][objectstore-readme] |
+ | [`parquet`] | Support for Parquet columnar file format | [docs.rs](https://docs.rs/parquet/latest) | [(README)][parquet-readme] |
+ | [`parquet_derive`] | A crate for deriving RecordWriter/RecordReader for arbitrary, simple structs | [docs.rs](https://docs.rs/parquet-derive/latest) | [(README)][parquet-derive-readme] |

  The current development version the API documentation in this repo can be found [here](https://arrow.apache.org/rust).

- There are two related crates in a different repository
+ [apache arrow]: https://arrow.apache.org/
+ [`arrow`]: https://crates.io/crates/arrow
+ [`parquet`]: https://crates.io/crates/parquet
+ [`parquet_derive`]: https://crates.io/crates/parquet-derive
+ [`arrow-flight`]: https://crates.io/crates/arrow-flight
+ [`object-store`]: https://crates.io/crates/object-store

- | Crate | Description | Documentation |
- | ---------- | --------------------------------------- | ----------------------------- |
- | DataFusion | In-memory query engine with SQL support | [(README)][datafusion-readme] |
- | Ballista | Distributed query execution | [(README)][ballista-readme] |
+ ## Release Versioning and Schedule

- Collectively, these crates support a vast array of functionality for analytic computations in Rust.
+ ### `arrow` and `parquet` crates

- For example, you can write an SQL query or a `DataFrame` (using the `datafusion` crate), run it against a parquet file (using the `parquet` crate), evaluate it in-memory using Arrow's columnar format (using the `arrow` crate), and send to another process (using the `arrow-flight` crate).
+ The Arrow Rust project releases approximately monthly and follows [Semantic
+ Versioning].

- Generally speaking, the `arrow` crate offers functionality for using Arrow arrays, and `datafusion` offers most operations typically found in SQL, including `join`s and window functions.
+ Due to available maintainer and testing bandwidth, [`arrow`] crates ([`arrow`],
+ [`arrow-flight`], etc.) are released on the same schedule with the same versions
+ as the [`parquet`] and [`parquet-derive`] crates.
+
+ Starting June 2024, we plan to release new major versions with potentially
+ breaking API changes at most once a quarter, and release incremental minor versions in
+ the intervening months. See [this ticket] for more details.
+
+ For example:
+
+ | Approximate Date | Version | Notes |
+ | ---------------- | -------- | --------------------------------------- |
+ | Jun 2024 | `52.0.0` | Major, potentially breaking API changes |
+ | Jul 2024 | `52.1.0` | Minor, NO breaking API changes |
+ | Aug 2024 | `52.2.0` | Minor, NO breaking API changes |
+ | Sep 2024 | `53.0.0` | Major, potentially breaking API changes |
+
+ [this ticket]: https://github.com/apache/arrow-rs/issues/5368
+ [semantic versioning]: https://semver.org/
+
+ ### `object_store` crate
+
+ The [`object_store`] crate is released independently of the `arrow` and
+ `parquet` crates and follows [Semantic Versioning]. We aim to release new
+ versions approximately every 2 months.
+
+ [`object_store`]: https://crates.io/crates/object_store
+
+ ## Related Projects
+
+ There are two related crates in different repositories
+
+ | Crate | Description | Documentation |
+ | -------------- | --------------------------------------- | ----------------------------- |
+ | [`datafusion`] | In-memory query engine with SQL support | [(README)][datafusion-readme] |
+ | [`ballista`] | Distributed query execution | [(README)][ballista-readme] |
+
+ [`datafusion`]: https://crates.io/crates/datafusion
+ [`ballista`]: https://crates.io/crates/ballista
+
+ Collectively, these crates support a wider array of functionality for analytic computations in Rust.
+
+ For example, you can write SQL queries or a `DataFrame` (using the
+ [`datafusion`] crate) to read a parquet file (using the [`parquet`] crate),
+ evaluate it in-memory using Arrow's columnar format (using the [`arrow`] crate),
+ and send to another process (using the [`arrow-flight`] crate).
+
+ Generally speaking, the [`arrow`] crate offers functionality for using Arrow
+ arrays, and [`datafusion`] offers most operations typically found in SQL,
+ including `join`s and window functions.

  You can find more details about each crate in their respective READMEs.

  ## Arrow Rust Community

- The `dev@arrow.apache.org` mailing list serves as the core communication channel for the Arrow community. Instructions for signing up and links to the archives can be found at the [Arrow Community](https://arrow.apache.org/community/) page. All major announcements and communications happen there.
+ The `dev@arrow.apache.org` mailing list serves as the core communication channel for the Arrow community. Instructions for signing up and links to the archives can be found on the [Arrow Community](https://arrow.apache.org/community/) page. All major announcements and communications happen there.

  The Rust Arrow community also uses the official [ASF Slack](https://s.apache.org/slack-invite) for informal discussions and coordination. This is
  a great place to meet other contributors and get guidance on where to contribute. Join us in the `#arrow-rust` channel and feel free to ask for an invite via:
@@ -72,8 +125,9 @@ There is more information in the [contributing] guide.
  [contributing]: CONTRIBUTING.md
  [parquet-readme]: parquet/README.md
  [flight-readme]: arrow-flight/README.md
- [datafusion-readme]: https://github.com/apache/arrow-datafusion/blob/main/README.md
- [ballista-readme]: https://github.com/apache/arrow-ballista/blob/main/README.md
+ [datafusion-readme]: https://github.com/apache/datafusion/blob/main/README.md
+ [ballista-readme]: https://github.com/apache/datafusion-ballista/blob/main/README.md
  [objectstore-readme]: object_store/README.md
+ [parquet-derive-readme]: parquet_derive/README.md
  [issues]: https://github.com/apache/arrow-rs/issues
  [discussions]: https://github.com/apache/arrow-rs/discussions
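
Note on the release schedule added to README.md in this commit: a downstream crate that depends on `arrow = "52.0.0"` uses Cargo's default caret requirement, so it should pick up the planned `52.1.0` and `52.2.0` minor releases but not `53.0.0`. The sketch below illustrates that rule against the versions in the schedule table. It is a simplified illustration and not Cargo's resolver: the `Version` type and the `caret_compatible` helper are hypothetical names, pre-release tags are ignored, and the special rules for `0.x` versions are not modelled.

```rust
/// A plain `major.minor.patch` version (hypothetical type for illustration).
/// Derived ordering compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    /// Parses strings such as "52.1.0"; returns None for anything else.
    fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.');
        let major = parts.next()?.parse::<u64>().ok()?;
        let minor = parts.next()?.parse::<u64>().ok()?;
        let patch = parts.next()?.parse::<u64>().ok()?;
        if parts.next().is_some() {
            return None; // more than three components
        }
        Some(Version { major, minor, patch })
    }
}

/// Hypothetical helper: does `candidate` satisfy a caret requirement on
/// `required`? Simplification: only handles major >= 1, where the rule is
/// "same major version, and not older than the requirement".
fn caret_compatible(required: Version, candidate: Version) -> bool {
    required.major > 0 && candidate.major == required.major && candidate >= required
}

fn main() {
    let required = Version::parse("52.0.0").expect("valid version");
    // Versions taken from the example schedule in the README diff.
    for v in ["52.0.0", "52.1.0", "52.2.0", "53.0.0"] {
        let candidate = Version::parse(v).expect("valid version");
        println!(
            "{} satisfies ^52.0.0: {}",
            v,
            caret_compatible(required, candidate)
        );
    }
}
```

Under this schedule, a crate would only need to revisit its `arrow` and `parquet` requirements when a new major version ships, which the README plans for at most once a quarter.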
