Update DataFusion introduction to show that DataFusion offers packaged versions for end users
apache · Sep 28, 2024 · 2b1e183
1 parent f87db21
commit 2b1e183
Showing 1 changed file with 14 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -42,14 +42,25 @@
 </a>
 
 DataFusion is an extensible query engine written in [Rust] that
-uses [Apache Arrow] as its in-memory format. DataFusion's target users are
+uses [Apache Arrow] as its in-memory format.
+
+The core DataFusion libraries in this repository are not designed to be an out-of-the-box tool for end users. The
+following subprojects offer packaged versions of DataFusion.
+
+- [DataFusion Python](https://github.com/apache/datafusion-python/) offers a Python interface for SQL and DataFrame
+  queries.
+- [DataFusion Comet](https://github.com/apache/datafusion-comet/) is an accelerator for Apache Spark based on
+  DataFusion.
+- [DataFusion Ray](https://github.com/apache/datafusion-ray/) provides a distributed version of DataFusion that scales
+  out on Ray clusters.
+
+The target audience for the DataFusion crates in this repository is
 developers building fast and feature rich database and analytic systems,
 customized to particular workloads. See [use cases] for examples.
 
-"Out of the box," DataFusion offers [SQL] and [`Dataframe`] APIs,
+DataFusion offers [SQL] and [`Dataframe`] APIs,
 excellent [performance], built-in support for CSV, Parquet, JSON, and Avro,
 extensive customization, and a great community.
-[Python Bindings] are also available.
 
 DataFusion features a full query planner, a columnar, streaming, multi-threaded,
 vectorized execution engine, and partitioned data sources. You can