Moved to datafusion-contrib

Datafusion-Bigtable

Bigtable data source for Apache Arrow Datafusion

Run SQL on Bigtable

This crate implements a Bigtable data source and executor for DataFusion. It is built on top of tonic, a gRPC client library for Rust.

Quick Start

let bigtable_datasource = BigtableDataSource::new(
    "emulator".to_owned(),                               // project
    "dev".to_owned(),                                    // instance
    "weather_balloons".to_owned(),                       // table
    "measurements".to_owned(),                           // column family
    vec!["_row_key".to_owned()],                         // table_partition_cols
    vec![Field::new("pressure", DataType::Utf8, false)], // qualifiers
    true,                                                // only_read_latest
).await.unwrap();

let mut ctx = ExecutionContext::new();
ctx.register_table("weather_balloons", Arc::new(bigtable_datasource)).unwrap();

ctx.sql("SELECT \"_row_key\", pressure, \"_timestamp\" FROM weather_balloons where \"_row_key\" = 'us-west2#3698#2021-03-05-1200'").await?.collect().await?;
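The Roadmap below marks three row-key predicates as supported: `=`, `IN`, and `BETWEEN`. As a minimal sketch, the helpers here build the SQL text for each shape so that the quoting of `"_row_key"` and of the key literals stays in one place. The helper names (`sql_literal`, `select_by`, `eq_query`, `in_query`, `between_query`) are hypothetical and not part of this crate; only the table and column names are taken from the Quick Start.

```rust
/// Wraps a value in single quotes, doubling any embedded single quote
/// (standard SQL string-literal escaping).
fn sql_literal(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Builds the Quick Start SELECT with a predicate on "_row_key".
/// A raw string keeps the double-quoted identifiers readable.
fn select_by(predicate: &str) -> String {
    format!(
        r#"SELECT "_row_key", pressure, "_timestamp" FROM weather_balloons WHERE "_row_key" {}"#,
        predicate
    )
}

/// "_row_key" = '<key>'
fn eq_query(key: &str) -> String {
    select_by(&format!("= {}", sql_literal(key)))
}

/// "_row_key" IN ('<key1>', '<key2>', ...)
fn in_query(keys: &[&str]) -> String {
    let list: Vec<String> = keys.iter().map(|k| sql_literal(k)).collect();
    select_by(&format!("IN ({})", list.join(", ")))
}

/// "_row_key" BETWEEN '<start>' AND '<end>'
fn between_query(start: &str, end: &str) -> String {
    select_by(&format!(
        "BETWEEN {} AND {}",
        sql_literal(start),
        sql_literal(end)
    ))
}
```

Each returned string can be passed to `ctx.sql(...)` in the same way as the query above, for example `ctx.sql(&eq_query("us-west2#3698#2021-03-05-1200"))`.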

Roadmap

SQL

✅ select by "_row_key" =
✅ select by "_row_key" IN
✅ select by "_row_key" BETWEEN
select by composite row keys (via table_partition_cols and table_partition_separator)
Projection pushdown
Predicate pushdown (value range)
Limit pushdown
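The composite row key item above refers to keys such as `us-west2#3698#2021-03-05-1200` from the Quick Start, where several values are joined by a separator. The following is only a sketch of that idea using the standard library. It does not show how this crate implements `table_partition_cols` or `table_partition_separator`, and the column names `region`, `balloon_id`, and `event_time` are assumptions made up for illustration.

```rust
/// Joins partition column values into a single row key.
fn compose_row_key(parts: &[&str], separator: &str) -> String {
    parts.join(separator)
}

/// Splits a row key into (column name, value) pairs.
/// Returns None when the number of parts does not match the number of columns.
fn split_row_key<'a>(
    row_key: &'a str,
    columns: &[&'a str],
    separator: &str,
) -> Option<Vec<(&'a str, &'a str)>> {
    let parts: Vec<&'a str> = row_key.split(separator).collect();
    if parts.len() != columns.len() {
        return None;
    }
    // Pair each column name with the value found at the same position.
    Some(columns.iter().copied().zip(parts).collect())
}
```

Calling `split_row_key("us-west2#3698#2021-03-05-1200", &["region", "balloon_id", "event_time"], "#")` pairs each hypothetical column with its part of the key, and `compose_row_key` reverses the operation.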

General

Multi-threaded or partition-aware execution
Production-ready Bigtable SDK in Rust

Note: datafusion-bigtable provides only the physical executor for DataFusion. Aggregations, GROUP BY, and joins are implemented and handled by DataFusion itself.
