
Commit 42ef58e

andygrove and alamb authored
docs: Update DataFusion introduction to clarify that DataFusion does provide an "out of the box" query engine (#12666)
* Update DataFusion introduction to show that DataFusion offers packaged versions for end users
* change order
* Update README.md
  Co-authored-by: Andrew Lamb <[email protected]>
* refine wording and update user guide for consistency
* prettier

---------

Co-authored-by: Andrew Lamb <[email protected]>
1 parent 1f2f02f commit 42ef58e

2 files changed: +28 -4 lines changed

README.md

Lines changed: 14 additions & 3 deletions
@@ -42,14 +42,25 @@
 </a>

 DataFusion is an extensible query engine written in [Rust] that
-uses [Apache Arrow] as its in-memory format. DataFusion's target users are
+uses [Apache Arrow] as its in-memory format.
+
+The DataFusion libraries in this repository are used to build data-centric system software. DataFusion also provides the
+following subprojects, which are packaged versions of DataFusion intended for end users.
+
+- [DataFusion Python](https://github.com/apache/datafusion-python/) offers a Python interface for SQL and DataFrame
+queries.
+- [DataFusion Ray](https://github.com/apache/datafusion-ray/) provides a distributed version of DataFusion that scales
+out on Ray clusters.
+- [DataFusion Comet](https://github.com/apache/datafusion-comet/) is an accelerator for Apache Spark based on
+DataFusion.
+
+The target audience for the DataFusion crates in this repository are
 developers building fast and feature rich database and analytic systems,
 customized to particular workloads. See [use cases] for examples.

-"Out of the box," DataFusion offers [SQL] and [`Dataframe`] APIs,
+DataFusion offers [SQL] and [`Dataframe`] APIs,
 excellent [performance], built-in support for CSV, Parquet, JSON, and Avro,
 extensive customization, and a great community.
-[Python Bindings] are also available.

 DataFusion features a full query planner, a columnar, streaming, multi-threaded,
 vectorized execution engine, and partitioned data sources. You can

docs/source/index.rst

Lines changed: 14 additions & 1 deletion
@@ -34,7 +34,20 @@ Apache DataFusion


 DataFusion is an extensible query engine written in `Rust <http://rustlang.org>`_ that
-uses `Apache Arrow <https://arrow.apache.org>`_ as its in-memory format. DataFusion's target users are
+uses `Apache Arrow <https://arrow.apache.org>`_ as its in-memory format.
+
+This documentation is for the <a href="https://github.com/apache/datafusion">core DataFusion project</a>, which contains
+libraries that are used to build data-centric system software. DataFusion also offers the following subprojects, which
+provide packaged versions of DataFusion intended for end users, and these have separate documentation.
+
+- <a href="https://datafusion.apache.org/python/">DataFusion Python</a> offers a Python interface for SQL and DataFrame
+queries.
+- <a href="https://github.com/apache/datafusion-ray/">DataFusion Ray</a> provides a distributed version of DataFusion
+that scales out on <a href="https://www.ray.io">Ray</a> clusters.
+- <a href="https://datafusion.apache.org/comet/">DataFusion Comet</a> is an accelerator for Apache Spark based on
+DataFusion.
+
+DataFusion's target users are
 developers building fast and feature rich database and analytic systems,
 customized to particular workloads. See `use cases <https://datafusion.apache.org/user-guide/introduction.html#use-cases>`_ for examples.
