Improve comments on target user and unify summaries (#12418)
alamb authored Sep 12, 2024
1 parent 1f06308 commit 97ad0ad
Showing 3 changed files with 53 additions and 21 deletions.
25 changes: 22 additions & 3 deletions README.md
@@ -41,9 +41,28 @@
 <img src="./docs/source/_static/images/2x_bgwhite_original.png" width="512" alt="logo"/>
 </a>

-Apache DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in
-[Rust](http://rustlang.org), using the [Apache Arrow](https://arrow.apache.org)
-in-memory format. [Python Bindings](https://github.com/apache/datafusion-python) are also available. DataFusion offers SQL and Dataframe APIs, excellent [performance](https://benchmark.clickhouse.com/), built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.
+DataFusion is an extensible query engine written in [Rust] that
+uses [Apache Arrow] as its in-memory format. DataFusion's target users are
+developers building fast and feature rich database and analytic systems,
+customized to particular workloads. See [use cases] for examples.
+
+"Out of the box," DataFusion offers [SQL] and [`Dataframe`] APIs,
+excellent [performance], built-in support for CSV, Parquet, JSON, and Avro,
+extensive customization, and a great community.
+[Python Bindings] are also available.
+
+DataFusion features a full query planner, a columnar, streaming, multi-threaded,
+vectorized execution engine, and partitioned data sources. You can
+customize DataFusion at almost all points including additional data sources,
+query languages, functions, custom operators and more.
+See the [Architecture] section for more details.
+
+[rust]: http://rustlang.org
+[apache arrow]: https://arrow.apache.org
+[use cases]: https://datafusion.apache.org/user-guide/introduction.html#use-cases
+[python bindings]: https://github.com/apache/datafusion-python
+[performance]: https://benchmark.clickhouse.com/
+[architecture]: https://datafusion.apache.org/contributor-guide/architecture.html

 Here are links to some important information

24 changes: 14 additions & 10 deletions datafusion/core/src/lib.rs
@@ -17,24 +17,28 @@
 #![warn(missing_docs, clippy::needless_borrow)]

 //! [DataFusion] is an extensible query engine written in Rust that
-//! uses [Apache Arrow] as its in-memory format. DataFusion help developers
-//! build fast and feature rich database and analytic systems, customized to
-//! particular workloads. See [use cases] for examples
+//! uses [Apache Arrow] as its in-memory format. DataFusion's target users are
+//! developers building fast and feature rich database and analytic systems,
+//! customized to particular workloads. See [use cases] for examples.
 //!
-//! "Out of the box," DataFusion quickly runs complex [SQL] and
-//! [`DataFrame`] queries using a full-featured query planner, a columnar,
-//! streaming, multi-threaded, vectorized execution engine, and partitioned data
-//! sources (Parquet, CSV, JSON, and Avro).
+//! "Out of the box," DataFusion offers [SQL] and [`Dataframe`] APIs,
+//! excellent [performance], built-in support for CSV, Parquet, JSON, and Avro,
+//! extensive customization, and a great community.
+//! [Python Bindings] are also available.
 //!
-//! DataFusion is designed for easy customization such as
-//! additional data sources, query languages, functions, custom
-//! operators and more. See the [Architecture] section for more details.
+//! DataFusion features a full query planner, a columnar, streaming, multi-threaded,
+//! vectorized execution engine, and partitioned data sources. You can
+//! customize DataFusion at almost all points including additional data sources,
+//! query languages, functions, custom operators and more.
+//! See the [Architecture] section below for more details.
 //!
 //! [DataFusion]: https://datafusion.apache.org/
 //! [Apache Arrow]: https://arrow.apache.org
 //! [use cases]: https://datafusion.apache.org/user-guide/introduction.html#use-cases
 //! [SQL]: https://datafusion.apache.org/user-guide/sql/index.html
 //! [`DataFrame`]: dataframe::DataFrame
+//! [performance]: https://benchmark.clickhouse.com/
+//! [Python Bindings]: https://github.com/apache/datafusion-python
 //! [Architecture]: #architecture
 //!
 //! # Examples

25 changes: 17 additions & 8 deletions docs/source/index.rst
@@ -32,14 +32,23 @@ Apache DataFusion
 <a class="github-button" href="https://github.com/apache/datafusion/fork" data-size="large" data-show-count="true" aria-label="Fork apache/datafusion on GitHub">Fork</a>
 </p>

-DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in
-`Rust <http://rustlang.org>`_, using the `Apache Arrow <https://arrow.apache.org>`_
-in-memory format.
-
-DataFusion offers SQL and Dataframe APIs, excellent
-`performance <https://benchmark.clickhouse.com>`_, built-in support for
-CSV, Parquet, JSON, and Avro, extensive customization, and a great
-community.
+
+DataFusion is an extensible query engine written in `Rust <http://rustlang.org>`_ that
+uses `Apache Arrow <https://arrow.apache.org>`_ as its in-memory format. DataFusion's target users are
+developers building fast and feature rich database and analytic systems,
+customized to particular workloads. See `use cases <https://datafusion.apache.org/user-guide/introduction.html#use-cases>`_ for examples.
+
+"Out of the box," DataFusion offers `SQL <https://datafusion.apache.org/user-guide/sql/index.html>`_
+and `Dataframe <https://docs.rs/datafusion/latest/datafusion/dataframe/struct.DataFrame.html>`_ APIs,
+excellent `performance <https://benchmark.clickhouse.com/>`_, built-in support for CSV, Parquet, JSON, and Avro,
+extensive customization, and a great community.
+`Python Bindings <https://github.com/apache/datafusion-python>`_ are also available.
+
+DataFusion features a full query planner, a columnar, streaming, multi-threaded,
+vectorized execution engine, and partitioned data sources. You can
+customize DataFusion at almost all points including additional data sources,
+query languages, functions, custom operators and more.
+See the `Architecture <https://datafusion.apache.org/contributor-guide/architecture.html>`_ section for more details.

 To get started, see

