From b8378570281cd743ca5ec4508ddf3fb785bbe100 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 21 Sep 2023 09:51:50 +0200 Subject: [PATCH 1/7] Rework extension pages --- _data/menu_docs_dev.json | 4 + css/main.scss | 52 ++++++------- docs/api/wasm/extensions.md | 24 +++--- docs/extensions/autocomplete.md | 6 +- docs/extensions/full_text_search.md | 2 +- docs/extensions/iceberg.md | 9 ++- docs/extensions/official_extensions.md | 26 +++++++ docs/extensions/overview.md | 29 +------ docs/extensions/postgres_scanner.md | 28 +++---- docs/extensions/spatial.md | 103 +++++++++++++------------ docs/extensions/sqlite_scanner.md | 10 ++- docs/extensions/substrait.md | 8 +- 12 files changed, 159 insertions(+), 142 deletions(-) create mode 100644 docs/extensions/official_extensions.md diff --git a/_data/menu_docs_dev.json b/_data/menu_docs_dev.json index 6bf4d1872df..980d1704bcb 100644 --- a/_data/menu_docs_dev.json +++ b/_data/menu_docs_dev.json @@ -929,6 +929,10 @@ "page": "Overview", "url": "overview" }, + { + "page": "Official Extensions", + "url": "official_extensions" + }, { "page": "Working with Extensions", "url": "working_with_extensions" diff --git a/css/main.scss b/css/main.scss index b34d634dcff..acfde2dee4a 100644 --- a/css/main.scss +++ b/css/main.scss @@ -1024,6 +1024,32 @@ body.documentation{ nav.mobile{ display: none; } + span.github{ + vertical-align: 1px; + display: inline-block; + background: #D9D9D9; + height: 17px; + line-height: 17px; + padding: 0 5px; + border-radius: 50px; + font-size: 10px; + color: black; + margin-left: 2px; + font-family: "SuisseIntl-Medium"; + transition: background .2s; + &::after{ + content: ""; + background-image: url("data:image/svg+xml,%3Csvg width='10' height='10' viewBox='0 0 10 10' fill='none' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='m4.625 2.373.75-.751a2.123 2.123 0 1 1 3.003 3.003l-.75.75M5.374 7.627l-.75.751a2.123 2.123 0 0 1-3.003-3.003l.75-.75m.751 2.252 3.754-3.754' stroke='%23000' 
stroke-linecap='round' stroke-linejoin='round'/%3E%3C/svg%3E"); + display: inline-block; + width: 10px; + height: 10px; + margin-left: 5px; + vertical-align: -1px; + } + &:hover{ + background: #78A6FF; + } + } main .wrap{ width: calc(100% - 252px); margin-left: 252px; @@ -1043,32 +1069,6 @@ body.documentation{ margin-bottom: -4px; margin-top: 0px; }} - table span.git{ - vertical-align: 1px; - display: inline-block; - background: #D9D9D9; - height: 17px; - line-height: 17px; - padding: 0 5px; - border-radius: 50px; - font-size: 10px; - color: black; - margin-left: 2px; - font-family: "SuisseIntl-Medium"; - transition: background .2s; - &::after{ - content: ""; - background-image: url("data:image/svg+xml,%3Csvg width='10' height='10' viewBox='0 0 10 10' fill='none' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='m4.625 2.373.75-.751a2.123 2.123 0 1 1 3.003 3.003l-.75.75M5.374 7.627l-.75.751a2.123 2.123 0 0 1-3.003-3.003l.75-.75m.751 2.252 3.754-3.754' stroke='%23000' stroke-linecap='round' stroke-linejoin='round'/%3E%3C/svg%3E"); - display: inline-block; - width: 10px; - height: 10px; - margin-left: 5px; - vertical-align: -1px; - } - &:hover{ - background: #78A6FF; - } - } /*#all-available-extensions td > a:first-of-type{ margin-right: 8px; }*/ diff --git a/docs/api/wasm/extensions.md b/docs/api/wasm/extensions.md index 74758f0842f..2203ac6f29f 100644 --- a/docs/api/wasm/extensions.md +++ b/docs/api/wasm/extensions.md @@ -26,18 +26,18 @@ Autoloading, so the possibility for DuckDB to add extension functionality on-the | Extension name | Description | Aliases | |---|-----|--| -| autocomplete | Adds support for autocomplete in the shell | | -| [excel](../../extensions/excel) | Adds support for Excel-like format strings | | -| [fts](../../extensions/full_text_search) | Adds support for Full-Text Search Indexes | | -| icu | Adds support for time zones and collations using the ICU library | | -| inet | Adds support for IP-related data types and functions | | -| 
[json](../../extensions/json) | Adds support for JSON operations | | -| parquet | Adds support for reading and writing parquet files | | -| [sqlite_scanner](../../extensions/sqlite_scanner) [GitHub](https://github.com/duckdblabs/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 | -| sqlsmit | | | -| [substrait](../../extensions/substrait) [GitHub](https://github.com/duckdblabs/substrait) | Adds support for the Substrait integration | | -| tpcds | Adds TPC-DS data generation and query support | | -| tpch | Adds TPC-H data generation and query support | | +| autocomplete | Adds support for autocomplete in the shell | | +| [excel](../../extensions/excel) | Adds support for Excel-like format strings | | +| [fts](../../extensions/full_text_search) | Adds support for Full-Text Search Indexes | | +| icu | Adds support for time zones and collations using the ICU library | | +| inet | Adds support for IP-related data types and functions | | +| [json](../../extensions/json) | Adds support for JSON operations | | +| parquet | Adds support for reading and writing parquet files | | +| [sqlite_scanner](../../extensions/sqlite_scanner) [GitHub](https://github.com/duckdblabs/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 | +| sqlsmith | | | +| [substrait](../../extensions/substrait) [GitHub](https://github.com/duckdblabs/substrait) | Adds support for the Substrait integration | | +| tpcds | Adds TPC-DS data generation and query support | | +| tpch | Adds TPC-H data generation and query support | | WebAssembly is basically an additional platform, and there might be platform specific limitations that make some extensions not able to match their native capabilities or to perform them in a different way. We will document here relevant differences for DuckDB-hosted extensions. 
diff --git a/docs/extensions/autocomplete.md b/docs/extensions/autocomplete.md index 586f42999bb..6285abe7d16 100644 --- a/docs/extensions/autocomplete.md +++ b/docs/extensions/autocomplete.md @@ -2,14 +2,14 @@ layout: docu title: AutoComplete --- + This extension adds supports for autocomplete. | Function | Description | |:----------------------------------------|:-------------------------------------------------------| | `sql_auto_complete(`*`query_string`*`)` | Attempts autocompletion on the given *`query_string`*. | - -## Example: +## Example ```sql SELECT * FROM sql_auto_complete('SEL'); @@ -39,5 +39,3 @@ Returns: | DEALLOCATE | 0 | | UPDATE | 0 | | DROP | 0 | - - diff --git a/docs/extensions/full_text_search.md b/docs/extensions/full_text_search.md index 3cff6d54141..a5be6524c02 100644 --- a/docs/extensions/full_text_search.md +++ b/docs/extensions/full_text_search.md @@ -2,6 +2,7 @@ layout: docu title: Full Text Search --- + Full Text Search is an extension to DuckDB that allows for search through strings, similar to SQLite's FTS5 extension. ## API @@ -70,7 +71,6 @@ Reduces words to their base. Used internally by the extension. |input_string|`VARCHAR`|The column or constant to be stemmed| |stemmer|`VARCHAR`|The type of stemmer to be used. 
One of `'arabic'`, `'basque'`, `'catalan'`, `'danish'`, `'dutch'`, `'english'`, `'finnish'`, `'french'`, `'german'`, `'greek'`, `'hindi'`, `'hungarian'`, `'indonesian'`, `'irish'`, `'italian'`, `'lithuanian'`, `'nepali'`, `'norwegian'`, `'porter'`, `'portuguese'`, `'romanian'`, `'russian'`, `'serbian'`, `'spanish'`, `'swedish'`, `'tamil'`, `'turkish'`, or `'none'` if no stemming is to be used.| - ## Example Usage ```sql diff --git a/docs/extensions/iceberg.md b/docs/extensions/iceberg.md index 31ee9ac5178..43b4024b220 100644 --- a/docs/extensions/iceberg.md +++ b/docs/extensions/iceberg.md @@ -2,6 +2,11 @@ layout: docu title: Iceberg --- -The [__iceberg__ extension](https://github.com/duckdblabs/duckdb_iceberg) is a loadable extension that implements support for the [Apache Iceberg format](https://iceberg.apache.org/). -> This extension currently only works on main branch of DuckDB. +The `iceberg` extension is a loadable extension that implements support for the [Apache Iceberg format](https://iceberg.apache.org/). + +> This extension currently only works on the `main` branch of DuckDB (bleeding edge releases). 
+ +## Source Code + +[GitHub](https://github.com/duckdblabs/duckdb_iceberg) diff --git a/docs/extensions/official_extensions.md b/docs/extensions/official_extensions.md new file mode 100644 index 00000000000..db664c0bb3f --- /dev/null +++ b/docs/extensions/official_extensions.md @@ -0,0 +1,26 @@ +--- +layout: docu +title: Official Extensions +--- + +| Extension name | Description | Aliases | +|---|-----|--| +| arrow [GitHub](https://github.com/duckdblabs/arrow) | A zero-copy data integration between Apache Arrow and DuckDB | | +| autocomplete | Adds support for autocomplete in the shell | | +| aws | Provides features that depend on the AWS SDK | | +| azure | Adds a filesystem abstraction for Azure blob storage to DuckDB | | +| [excel](excel) | Adds support for Excel-like format strings | | +| [fts](full_text_search) | Adds support for Full-Text Search Indexes | | +| [httpfs](httpfs) | Adds support for reading and writing files over a HTTP(S) connection | http, https, s3 | +| [iceberg](iceberg) [GitHub](https://github.com/duckdblabs/duckdb_iceberg) | Adds support for Apache Iceberg | | +| icu | Adds support for time zones and collations using the ICU library | | +| inet | Adds support for IP-related data types and functions | | +| jemalloc | Overwrites system allocator with JEMalloc | | +| [json](json) | Adds support for JSON operations | | +| parquet | Adds support for reading and writing parquet files | | +| [postgres_scanner](postgres_scanner) [GitHub](https://github.com/duckdblabs/postgres_scanner) | Adds support for reading from a Postgres database | postgres | +| [spatial](spatial) [GitHub](https://github.com/duckdblabs/duckdb_spatial) | Geospatial extension that adds support for working with spatial data and functions | | +| [sqlite_scanner](sqlite_scanner) [GitHub](https://github.com/duckdblabs/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 | +| [substrait](substrait) [GitHub](https://github.com/duckdblabs/substrait) | 
Adds support for the Substrait integration | | +| tpcds | Adds TPC-DS data generation and query support | | +| tpch | Adds TPC-H data generation and query support | | diff --git a/docs/extensions/overview.md b/docs/extensions/overview.md index 478f6d48b48..321b503b8cf 100644 --- a/docs/extensions/overview.md +++ b/docs/extensions/overview.md @@ -11,6 +11,8 @@ These may extend DuckDB's functionality by providing support for additional file > Extensions are loadable on all clients (e.g., Python and R). > Extensions distributed via the official repository are built and tested on MacOS (amd64 and arm64), Windows (amd64) and Linux (amd64 and arm64). +We maintain a [list of official extensions](official_extensions). + ## Using Extensions ### Listing Extensions @@ -83,32 +85,7 @@ Extensions are signed with a cryptographic key, which also simplifies distributi All extensions provided by the DuckDB core team are signed. If you wish to load your own extensions or extensions from third-parties you will need to enable the `allow_unsigned_extensions` flag. -To load unsigned extensions using the [CLI](../api/cli), pass the `-unsigned` flag to it on startup. 
- -### List of Official Extensions - -| Extension name | Description | Aliases | -|---|-----|--| -| arrow [GitHub](https://github.com/duckdblabs/arrow) | A zero-copy data integration between Apache Arrow and DuckDB | | -| autocomplete | Adds support for autocomplete in the shell | | -| aws | Provides features that depend on the AWS SDK | | -| azure | Adds a filesystem abstraction for Azure blob storage to DuckDB | | -| [excel](excel) | Adds support for Excel-like format strings | | -| [fts](full_text_search) | Adds support for Full-Text Search Indexes | | -| [httpfs](httpfs) | Adds support for reading and writing files over a HTTP(S) connection | http, https, s3 | -| [iceberg](iceberg) [GitHub](https://github.com/duckdblabs/duckdb_iceberg) | Adds support for Apache Iceberg | | -| icu | Adds support for time zones and collations using the ICU library | | -| inet | Adds support for IP-related data types and functions | | -| jemalloc | Overwrites system allocator with JEMalloc | | -| [json](json) | Adds support for JSON operations | | -| parquet | Adds support for reading and writing parquet files | | -| [postgres_scanner](postgres_scanner) [GitHub](https://github.com/duckdblabs/postgres_scanner) | Adds support for reading from a Postgres database | postgres | -| [spatial](spatial) [GitHub](https://github.com/duckdblabs/duckdb_spatial) | Geospatial extension that adds support for working with spatial data and functions | | -| [sqlite_scanner](sqlite_scanner) [GitHub](https://github.com/duckdblabs/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 | -| [substrait](substrait) [GitHub](https://github.com/duckdblabs/substrait) | Adds support for the Substrait integration | | -| tpcds | Adds TPC-DS data generation and query support | | -| tpch | Adds TPC-H data generation and query support | | - +To load unsigned extensions using the [CLI client](../api/cli), pass the `-unsigned` flag to it on startup. 
### Developing Extensions diff --git a/docs/extensions/postgres_scanner.md b/docs/extensions/postgres_scanner.md index 336295537f6..a8d3ce5d54c 100644 --- a/docs/extensions/postgres_scanner.md +++ b/docs/extensions/postgres_scanner.md @@ -3,7 +3,7 @@ layout: docu title: PostgreSQL Scanner --- -The `postgres` extension allows DuckDB to directly read data from a running PostgreSQL instance. The data can be queried directly from the underlying PostgreSQL tables, or read into DuckDB tables. +The `postgres` extension allows DuckDB to directly read data from a running PostgreSQL instance. The data can be queried directly from the underlying PostgreSQL tables, or read into DuckDB tables. See the [official announcement](/2022/09/30/postgres-scanner) for implementation details and background. ## Loading the Extension @@ -16,21 +16,21 @@ LOAD postgres; ## Usage -To make a PostgreSQL database accessible to DuckDB, use the `POSTGRES_ATTACH` command: +To make a PostgreSQL database accessible to DuckDB, use the `postgres_attach` command: ```sql -- load all data from "public" schema of the postgres instance running on localhost into the schema "main" -CALL POSTGRES_ATTACH(''); +CALL postgres_attach(''); -- attach the database with the given schema, loading tables from the source schema "public" into the target schema "abc" CALL postgres_attach('dbname=postgres user=postgres host=127.0.0.1', source_schema='public', sink_schema='abc'); ``` -`POSTGRES_ATTACH` takes a single required string parameter, which is the [`libpq` connection string](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING). For example you can pass `'dbname=postgresscanner'` to select a different database name. In the simplest case, the parameter is just `''`. There are three additional named parameters: +`postgres_attach` takes a single required string parameter, which is the [`libpq` connection string](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING). 
For example, you can pass `'dbname=postgresscanner'` to select a different database name. In the simplest case, the parameter is just `''`. There are four additional named parameters: -* `source_schema` the name of a non-standard schema name in PostgreSQL to get tables from. Default is `public`. -* `sink_schema` the schema name in DuckDB to create views. Default is `main`. -* `overwrite` whether we should overwrite existing views in the target schema, default is `false`. -* `filter_pushdown` whether filter predicates that DuckDB derives from the query should be forwarded to PostgreSQL, defaults to `false`. +* `source_schema` the name of a non-standard schema in PostgreSQL to get tables from. Default: `public`. +* `sink_schema` the schema name in DuckDB to create views in. Default: `main`. +* `overwrite` whether we should overwrite existing views in the target schema. Default: `false`. +* `filter_pushdown` whether filter predicates that DuckDB derives from the query should be forwarded to PostgreSQL. Default: `false`. The tables in the database are registered as views in DuckDB, you can list them as follows: @@ -43,16 +43,16 @@ Then you can query those views normally using SQL. ## Querying Individual Tables -If you prefer to not attach all tables, but just query a single table, that is possible using the `POSTGRES_SCAN` function, e.g. +If you prefer to not attach all tables, but just query a single table, that is possible using the `postgres_scan` function, e.g.: ```sql -SELECT * FROM POSTGRES_SCAN('', 'public', 'mytable'); +SELECT * FROM postgres_scan('', 'public', 'mytable'); ``` -`POSTGRES_SCAN` takes three string parameters, the `libpq` connection string (see above), a PostgreSQL schema name and a table name. The schema name is often `public`. +The `postgres_scan` function takes three string parameters: the `libpq` connection string (see above), a PostgreSQL schema name and a table name. The schema often used in PostgreSQL is `public`. 
-To use `filter_pushdown` use the `POSTGRES_SCAN_PUSHDOWN` function. +To use `filter_pushdown` use the `postgres_scan_pushdown` function. -## Extra Information +## Source Code -See [the repo](https://github.com/duckdblabs/postgres_scanner) for the source code of the extension, or the [official announcement](/2022/09/30/postgres-scanner) for implementation details and background. +[GitHub](https://github.com/duckdblabs/postgres_scanner) diff --git a/docs/extensions/spatial.md b/docs/extensions/spatial.md index 2020c36484e..f335cb0fde3 100644 --- a/docs/extensions/spatial.md +++ b/docs/extensions/spatial.md @@ -2,7 +2,9 @@ layout: docu title: Spatial --- + The `spatial` extension provides support for geospatial data processing in DuckDB. +For an overview of the extension, see our [blog post](/2023/04/28/spatial). ## GEOMETRY type @@ -26,7 +28,7 @@ The spatial extension implements a large number of scalar functions and overload 🦆 - DuckDB - functions that are implemented natively in this extension that are capable of operating directly on the DuckDB types -🔄 - CAST(GEOMETRY) - functions that are supported by implicitly casting to `GEOMETRY` and then using the `GEOMETRY` implementation +🔄 - `CAST(GEOMETRY)` - functions that are supported by implicitly casting to `GEOMETRY` and then using the `GEOMETRY` implementation The currently implemented spatial functions can roughly be categorized into the following groups: @@ -36,12 +38,12 @@ Convert between geometries and other formats. 
| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | |-----|---|--|--|--|---| -| VARCHAR ST_AsGeoJSON(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as POLYGON) | -| VARCHAR ST_AsHEXWKB(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | -| VARCHAR ST_AsText(GEOMETRY) | 🧭 | 🦆 | 🦆 | 🦆 | 🔄 (as POLYGON) | -| WKB_BLOB ST_AsWKB(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | -| GEOMETRY ST_GeomFromText(VARCHAR) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_GeomFromWKB(BLOB) | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as POLYGON) | +| `VARCHAR ST_AsGeoJSON(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `VARCHAR ST_AsHEXWKB(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `VARCHAR ST_AsText(GEOMETRY)` | 🧭 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `WKB_BLOB ST_AsWKB(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_GeomFromText(VARCHAR)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_GeomFromWKB(BLOB)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | ### Geometry Construction @@ -49,21 +51,21 @@ Construct new geometries from other geometries or other data. | Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | |-----|---|--|--|--|---| -| GEOMETRY ST_Point(DOUBLE, DOUBLE) | 🦆 | 🦆 | | | | -| GEOMETRY ST_ConvexHull(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_Boundary(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_Buffer(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_Centroid(GEOMETRY) | 🧭 | 🦆 | 🦆 | 🦆 | 🦆 | -| GEOMETRY ST_Collect(GEOMETRY[]) | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | -| GEOMETRY ST_Normalize(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_SimplifyPreserveTopology(GEOMETRY, DOUBLE) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_Simplify(GEOMETRY, DOUBLE) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_Union(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_Intersection(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GEOMETRY ST_MakeLine(GEOMETRY[]) | 🦆 | | 🦆 | | | -| GEOMETRY ST_Envelope(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | 
-| GEOMETRY ST_FlipCoordinates(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | -| GEOMETRY ST_Transform(GEOMETRY, VARCHAR, VARCHAR) | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Point(DOUBLE, DOUBLE)` | 🦆 | 🦆 | | | | +| `GEOMETRY ST_ConvexHull(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Boundary(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Buffer(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Centroid(GEOMETRY)` | 🧭 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Collect(GEOMETRY[]) ` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Normalize(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_SimplifyPreserveTopology(GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Simplify(GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Union(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Intersection(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_MakeLine(GEOMETRY[]) ` | 🦆 | | 🦆 | | | +| `GEOMETRY ST_Envelope(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_FlipCoordinates(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Transform(GEOMETRY, VARCHAR, VARCHAR)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | ### Spatial Properties @@ -72,16 +74,16 @@ Calculate and access spatial properties of geometries. 
| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | |-----|---|--|--|--|---| -| DOUBLE ST_Area(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | -| BOOLEAN ST_IsClosed(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_IsEmpty(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as POLYGON) | -| BOOLEAN ST_IsRing(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_IsSimple(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_IsValid(GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| DOUBLE ST_X(GEOMETRY) | 🧭 | 🦆 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| DOUBLE ST_Y(GEOMETRY) | 🧭 | 🦆 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| GeometryType ST_GeometryType(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as POLYGON) | -| DOUBLE ST_Length(GEOMETRY) | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as POLYGON) | +| `DOUBLE ST_Area(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `BOOLEAN ST_IsClosed(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsEmpty(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsRing(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsSimple(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsValid(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `DOUBLE ST_X(GEOMETRY)` | 🧭 | 🦆 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `DOUBLE ST_Y(GEOMETRY)` | 🧭 | 🦆 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GeometryType ST_GeometryType(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `DOUBLE ST_Length(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | ### Spatial Relationships @@ -90,19 +92,19 @@ Compute relationships and spatial predicates between geometries. 
| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | |-----|---|--|--|--|---| -| BOOLEAN ST_Within(GEOMETRY, GEOMETRY) | 🧭 | 🦆 or 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Touches(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Overlaps(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Contains(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🦆 or 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_CoveredBy(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Covers(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Crosses(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Difference(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Disjoint(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Intersects(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_Equals(GEOMETRY, GEOMETRY) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | -| DOUBLE ST_Distance(GEOMETRY, GEOMETRY) | 🧭 | 🦆 or 🔄 | 🦆 or 🔄 | 🔄 | 🔄 (as POLYGON) | -| BOOLEAN ST_DWithin(GEOMETRY, GEOMETRY, DOUBLE) | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as POLYGON) | +| `BOOLEAN ST_Within(GEOMETRY, GEOMETRY)` | 🧭 | 🦆 or 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Touches(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Overlaps(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Contains(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🦆 or 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_CoveredBy(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Covers(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Crosses(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Difference(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Disjoint(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_Intersects(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN 
ST_Equals(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `DOUBLE ST_Distance(GEOMETRY, GEOMETRY)` | 🧭 | 🦆 or 🔄 | 🦆 or 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_DWithin(GEOMETRY, GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | ## Spatial Table Functions @@ -193,10 +195,11 @@ For example to export a table to a GeoJSON file, with generated bounding boxes, COPY TO 'some/file/path/filename.geojson' WITH (FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES'); ``` -- `FORMAT`: is the only required option and must be set to `GDAL` to use the GDAL based copy function. -- `DRIVER`: is the GDAL driver to use for the export. See the table above for a list of available drivers. -- `LAYER_CREATION_OPTIONS`: list of options to pass to the GDAL driver. See the GDAL docs for the driver you are using for a list of available options. -## Extra Information +* `FORMAT`: is the only required option and must be set to `GDAL` to use the GDAL based copy function. +* `DRIVER`: is the GDAL driver to use for the export. See the table above for a list of available drivers. +* `LAYER_CREATION_OPTIONS`: list of options to pass to the GDAL driver. See the GDAL docs for the driver you are using for a list of available options. + +## Source Code -See [the repo](https://github.com/duckdblabs/duckdb_spatial) for the source code of the extension, or the [blog post](/2023/04/28/spatial). 
+[GitHub](https://github.com/duckdblabs/duckdb_spatial) diff --git a/docs/extensions/sqlite_scanner.md b/docs/extensions/sqlite_scanner.md index ecde7cb077c..f9383eb32bd 100644 --- a/docs/extensions/sqlite_scanner.md +++ b/docs/extensions/sqlite_scanner.md @@ -63,7 +63,7 @@ Then you can query those views normally using SQL, e.g., using the example queri ```sql SELECT cat.name category_name, - Sum(Ifnull(pay.amount, 0)) revenue + sum(ifnull(pay.amount, 0)) revenue FROM category cat LEFT JOIN film_category flm_cat ON cat.category_id = flm_cat.category_id @@ -111,7 +111,9 @@ When querying SQLite, DuckDB must deduce a specific column type mapping. DuckDB As DuckDB enforces the corresponding columns to contain only correctly typed values, we cannot load the string "hello" into a column of type `BIGINT`. As such, an error is thrown when reading from the "numbers" table above: -> Error: Mismatch Type Error: Invalid type in column "i": column was declared as integer, found "hello" of type "text" instead. +```text +Error: Mismatch Type Error: Invalid type in column "i": column was declared as integer, found "hello" of type "text" instead. +``` This error can be avoided by setting the `sqlite_all_varchar` option: @@ -129,6 +131,6 @@ If you want to run the `sqlite_scan` procedure more than once in the same DuckDB CALL sqlite_attach('sakila.db', overwrite=true); ``` -## Extra Information +## Source Code -See [the repo](https://github.com/duckdblabs/sqlite_scanner) for the source code of the extension. +[GitHub](https://github.com/duckdblabs/sqlite_scanner) diff --git a/docs/extensions/substrait.md b/docs/extensions/substrait.md index 2a3adbb28de..58c49d7da27 100644 --- a/docs/extensions/substrait.md +++ b/docs/extensions/substrait.md @@ -3,13 +3,11 @@ layout: docu title: Substrait --- -The main goal of this extension is to support both production and consumption of Substrait query plans in DuckDB. 
+The main goal of the `substrait` extension is to support both production and consumption of Substrait query plans in DuckDB. This extension is mainly exposed via 3 different APIs - the SQL API, the Python API, and the R API. Here we depict how to consume and produce Substrait query plans in each API. -Additionally, see the [`substrait` repository](https://github.com/duckdblabs/substrait) for further usage details. - > The Substrait integration is currently experimental. Support is currently only available on request. > If you have not asked for permission to ask for support, [contact us prior to opening an issue](https://duckdblabs.com/contact/). > If you open an issue without doing so, we will close it without further review. @@ -130,3 +128,7 @@ To consume a Substrait BLOB the `duckdb_prepare_substrait(con, blob)` function m result <- duckdb::duckdb_prepare_substrait(con, proto_bytes) df <- dbFetch(result) ``` + +## Source Code + +[GitHub](https://github.com/duckdblabs/substrait) From 017e406b13e3362a1d6d4c6a0ddf9908006e5ccf Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 21 Sep 2023 10:02:40 +0200 Subject: [PATCH 2/7] Updates to Wasm extension page --- docs/api/wasm/extensions.md | 42 +++++++++++++++++++------------------ 1 file changed, 22 insertions(+), 20 deletions(-) diff --git a/docs/api/wasm/extensions.md b/docs/api/wasm/extensions.md index 2203ac6f29f..ae43f464d1e 100644 --- a/docs/api/wasm/extensions.md +++ b/docs/api/wasm/extensions.md @@ -3,24 +3,26 @@ layout: docu title: Extensions --- -DuckDB-Wasm (dynamic) extension loading is modeled after regular DuckDB extension loading, with a few relevant differences due to the difference in platform. +DuckDB-Wasm's (dynamic) extension loading is modeled after the regular DuckDB's extension loading, with a few relevant differences due to the difference in platform. ### Format -Extensions in DuckDB are binaries to be dynamically loaded via dlopen. A cryptographical signature is appended to the binary. 
-Extensions in DuckDB-Wasm are a regular Wasm file to be dynamically loaded via Emscripten's dlopen. A cryptographical signature is appended to the Wasm file as a WebAssembly custom section called `duckdb_signature`. -This ensures the file remanins a valid WebAssembly file. Currently we require this custom section to be the last one, but this can be potentially relaxed in the future. +Extensions in DuckDB are binaries to be dynamically loaded via `dlopen`. A cryptographic signature is appended to the binary. +An extension in DuckDB-Wasm is a regular Wasm file to be dynamically loaded via Emscripten's `dlopen`. A cryptographic signature is appended to the Wasm file as a WebAssembly custom section called `duckdb_signature`. +This ensures the file remains a valid WebAssembly file. + +> Currently we require this custom section to be the last one, but this may be relaxed in the future. ### INSTALL and LOAD -INSTALL semantic in native embeddings of DuckDB is to fetch, decompress from gzip and store data in local disk. -LOAD semantic in native embeddings of DuckDB is to (optionally) perform signature checks AND dynamic load the binary with the main DuckDB binary. +The `INSTALL` semantic in native embeddings of DuckDB is to fetch, decompress from `gzip` and store data on local disk. +The `LOAD` semantic in native embeddings of DuckDB is to (optionally) perform signature checks *and* dynamically load the binary with the main DuckDB binary. -In DuckDB-Wasm, INSTALL is a no-op given there is no durable cross-session storage. LOAD will fetch (and decompress on the fly), perform signature checks *and* dynamically load via the Emscripten implementation of dlopen. +In DuckDB-Wasm, `INSTALL` is a no-op given there is no durable cross-session storage. The `LOAD` operation will fetch (and decompress on the fly), perform signature checks *and* dynamically load via the Emscripten implementation of `dlopen`. 
 
 ### Autoloading
 
-Autoloading, so the possibility for DuckDB to add extension functionality on-the-fly, is enabled by default in DuckDB-Wasm.
+[Autoloading](../../extensions/overview), i.e., the possibility for DuckDB to add extension functionality on-the-fly, is enabled by default in DuckDB-Wasm.
 
 ### List of Officially Available Extensions
 
@@ -39,38 +41,38 @@ Autoloading, so the possibility for DuckDB to add extension functionality on-the
 | tpcds | Adds TPC-DS data generation and query support | |
 | tpch | Adds TPC-H data generation and query support | |
 
-WebAssembly is basically an additional platform, and there might be platform specific limitations that make some extensions not able to match their native capabilities or to perform them in a different way. We will document here relevant differences for DuckDB-hosted extensions.
+WebAssembly is basically an additional platform, and there might be platform-specific limitations that make some extensions not able to match their native capabilities or to perform them in a different way. We will document here relevant differences for DuckDB-hosted extensions.
 
 #### HTTPFS
 
-HTTPFS extension is, at the moment, not available in DuckDB-Wasm. Https protocol capabilities needs to go through an additional layer, the browser, that adds both differences and some restrictions to what's doable from native.
+The HTTPFS extension is, at the moment, not available in DuckDB-Wasm. HTTPS protocol capabilities need to go through an additional layer, the browser, which adds both differences and some restrictions to what is doable from native.
 
-### Extension signing
+### Extension Signing
 
-As with regular DuckDB extensions, DuckDB-Wasm extension are by default checked on LOAD to verify the signature confirm the extension has not been tampered with.
+As with regular DuckDB extensions, DuckDB-Wasm extensions are by default checked on `LOAD` to verify the signature and confirm the extension has not been tampered with.
 Extension signature verification can be disabled via a configuration option.
-Signing is a property of the binary itself, so copying a DuckDB extension (say to serve it from a different location) will still keep a valid signature (for example for local development).
+Signing is a property of the binary itself, so copying a DuckDB extension (say to serve it from a different location) will still keep a valid signature (e.g., for local development).
 
-### Fetching DuckDB-Wasm extensions
+### Fetching DuckDB-Wasm Extensions
 
-DuckDB official extension are served at extensions.duckdb.org, and this is also the default value for the `default_extension_repository` option.
-On installing extensions, a relevant URL will be built that will look like `extensions.duckdb.org/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.gz`.
+Official DuckDB extensions are served at `extensions.duckdb.org`, and this is also the default value for the `default_extension_repository` option.
+When installing extensions, a relevant URL will be built that will look like `extensions.duckdb.org/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.gz`.
 
 DuckDB-Wasm extension are fetched only on load, and the URL will look like: `extensions.duckdb.org/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm`.
 
 Note that an additional `duckdb-wasm` is added to the folder structure, and the file is served as a `.wasm` file.
 
-DuckDB-Wasm extension are served pre-compressed using brotli compression. While fetched from a browser, extensions will be transparently uncompressed. If you want to fetch duckdb-wasm extension manually, you can use `curl --compress extensions.duckdb.org/......../icu.duckdb_extension.wasm`.
+DuckDB-Wasm extensions are served pre-compressed using Brotli compression. When fetched from a browser, extensions will be transparently uncompressed. If you want to fetch a DuckDB-Wasm extension manually, you can use `curl --compressed extensions.duckdb.org/<...>/icu.duckdb_extension.wasm`.
 
-### Serving extension from a third party repository
+### Serving Extensions from a Third-Party Repository
 
 As with regular DuckDB, if you use `SET custom_extension_repository = some.url.com`, subsequent loads will be attempted at `some.url.com/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm`.
 
-Note that GET requests on the extensions needs to be CORS enabled for a browser to allow the connection.
+Note that GET requests on the extensions need to be [CORS enabled](https://www.w3.org/wiki/CORS_Enabled) for a browser to allow the connection.
 
 ### Tooling
 
-Both extensions and the deployed DuckDB have been compiled using Emscripten 3.1.45.
+Both DuckDB-Wasm and its extensions have been compiled using Emscripten 3.1.45.
 
 {% include iframe.html src="https://shell.duckdb.org" %}
 

From 5dc2047b6126a7d4d95933bfa98d1a2ba4d1a423 Mon Sep 17 00:00:00 2001
From: Gabor Szarnyas
Date: Thu, 21 Sep 2023 10:27:33 +0200
Subject: [PATCH 3/7] Update docs/extensions/sqlite_scanner.md

Co-authored-by: Carlo Piovesan
---
 docs/extensions/sqlite_scanner.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/extensions/sqlite_scanner.md b/docs/extensions/sqlite_scanner.md
index f9383eb32bd..3edff5ca295 100644
--- a/docs/extensions/sqlite_scanner.md
+++ b/docs/extensions/sqlite_scanner.md
@@ -131,6 +131,6 @@ If you want to run the `sqlite_scan` procedure more than once in the same DuckDB
 CALL sqlite_attach('sakila.db', overwrite=true);
 ```
 
-## Source Code
+## GitHub Repository
 
 [GitHub](https://github.com/duckdblabs/sqlite_scanner)

From 5ad11061e65702f2f950ee91275b38f5036e970d Mon Sep 17 00:00:00 2001
From: Gabor Szarnyas
Date: Thu, 21 Sep 2023 10:27:47 +0200
Subject: [PATCH 4/7] Update docs/api/wasm/extensions.md

Co-authored-by: Carlo Piovesan
---
 docs/api/wasm/extensions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/api/wasm/extensions.md b/docs/api/wasm/extensions.md
index ae43f464d1e..0991290287d 100644
--- a/docs/api/wasm/extensions.md
+++ b/docs/api/wasm/extensions.md
@@ -72,7 +72,7 @@ Note that GET requests on the extensions need to be [CORS enabled](https://www.
 
 ### Tooling
 
-Both DuckDB-Wasm and its extensions have been compiled using Emscripten 3.1.45.
+Both DuckDB-Wasm and its extensions have been compiled using the latest packaged Emscripten toolchain.
 
 {% include iframe.html src="https://shell.duckdb.org" %}
 

From cfbaf6988b5bdafd42635d75ab97a23f3f0e7efa Mon Sep 17 00:00:00 2001
From: Gabor Szarnyas
Date: Thu, 21 Sep 2023 10:28:07 +0200
Subject: [PATCH 5/7] Format extension names consistently

---
 docs/extensions/httpfs.md | 8 ++++----
 docs/extensions/json.md   | 3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/docs/extensions/httpfs.md b/docs/extensions/httpfs.md
index 92587418e7f..3194aa22cab 100644
--- a/docs/extensions/httpfs.md
+++ b/docs/extensions/httpfs.md
@@ -3,14 +3,14 @@ layout: docu
 title: httpfs
 ---
 
-The __httpfs__ extension is a loadable extension implementing a file system that allows reading remote/writing remote files. For plain HTTP(S), only file reading is supported. For object storage using the S3 API, the __httpfs__ extension supports reading/writing/globbing files.
+The `httpfs` extension is a loadable extension implementing a file system that allows reading/writing remote files. For plain HTTP(S), only file reading is supported. For object storage using the S3 API, the `httpfs` extension supports reading/writing/globbing files.
 
 Some clients come prebundled with this extension, in which case it's not necessary to first install or even load the extension.
 Depending on the client you use, no action may be required, or you might have to `INSTALL httpfs` on first use and use `LOAD httpfs` at the start of every session.
 
 ## HTTP(S)
 
-With the __httpfs__ extension, it is possible to directly query files over the HTTP(S) protocol. This currently works for CSV, JSON, and Parquet files.
+With the `httpfs` extension, it is possible to directly query files over the HTTP(S) protocol. This currently works for CSV, JSON, and Parquet files.
 
 ```sql
 SELECT * FROM 'https://domain.tld/file.extension';
@@ -40,11 +40,11 @@ SELECT * FROM parquet_scan(['https://domain.tld/file1.parquet', 'https://domain.
 
 ## S3
 
-The __httpfs__ extension supports reading/writing/globbing files on object storage servers using the S3 API.
+The `httpfs` extension supports reading/writing/globbing files on object storage servers using the S3 API.
 
 ### Requirements
 
-The __httpfs__ filesystem is tested with [AWS S3](https://aws.amazon.com/s3/), [Minio](https://min.io/), [Google cloud](https://cloud.google.com/storage/docs/interoperability), and [lakeFS](https://docs.lakefs.io/integrations/duckdb.html). Other services that implement the S3 API should also work, but not all features may be supported. Below is a list of which parts of the S3 API are required for each __httpfs__ feature.
+The `httpfs` filesystem is tested with [AWS S3](https://aws.amazon.com/s3/), [Minio](https://min.io/), [Google cloud](https://cloud.google.com/storage/docs/interoperability), and [lakeFS](https://docs.lakefs.io/integrations/duckdb.html). Other services that implement the S3 API should also work, but not all features may be supported. Below is a list of which parts of the S3 API are required for each `httpfs` feature.
 
 | Feature | Required S3 API features |
 |:---|:---|
diff --git a/docs/extensions/json.md b/docs/extensions/json.md
index 68f69db654b..1744da7de0d 100644
--- a/docs/extensions/json.md
+++ b/docs/extensions/json.md
@@ -2,7 +2,8 @@
 layout: docu
 title: JSON
 ---
-The __json__ extension is a loadable extension that implements SQL functions that are useful for reading values from existing JSON, and creating new JSON data.
+
+The `json` extension is a loadable extension that implements SQL functions that are useful for reading values from existing JSON, and creating new JSON data.
 
 ## JSON Type
 

From f398471e4c002c344b4f7bc1defb271abecf2a24 Mon Sep 17 00:00:00 2001
From: Gabor Szarnyas
Date: Thu, 21 Sep 2023 10:28:55 +0200
Subject: [PATCH 6/7] Rename headers

---
 docs/extensions/iceberg.md          | 2 +-
 docs/extensions/postgres_scanner.md | 2 +-
 docs/extensions/spatial.md          | 2 +-
 docs/extensions/substrait.md        | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/extensions/iceberg.md b/docs/extensions/iceberg.md
index 43b4024b220..4602a9d731c 100644
--- a/docs/extensions/iceberg.md
+++ b/docs/extensions/iceberg.md
@@ -7,6 +7,6 @@ The `iceberg` extension is a loadable extension that implements support for the
 
 > This extension currently only works on the `main` branch of DuckDB (bleeding edge releases).
 
-## Source Code
+## GitHub Repository
 
 [GitHub](https://github.com/duckdblabs/postgres_scanner)
diff --git a/docs/extensions/postgres_scanner.md b/docs/extensions/postgres_scanner.md
index a8d3ce5d54c..22059c86afa 100644
--- a/docs/extensions/postgres_scanner.md
+++ b/docs/extensions/postgres_scanner.md
@@ -53,6 +53,6 @@ The `postgres_scan` function takes three string parameters, the `libpq` connecti
 
 To use `filter_pushdown` use the `postgres_scan_pushdown` function.
 
-## Source Code
+## GitHub Repository
 
 [GitHub](https://github.com/duckdblabs/postgres_scanner)
diff --git a/docs/extensions/spatial.md b/docs/extensions/spatial.md
index f335cb0fde3..157d36055c4 100644
--- a/docs/extensions/spatial.md
+++ b/docs/extensions/spatial.md
@@ -200,6 +200,6 @@ WITH (FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES');
 * `DRIVER`: is the GDAL driver to use for the export. See the table above for a list of available drivers.
 * `LAYER_CREATION_OPTIONS`: list of options to pass to the GDAL driver. See the GDAL docs for the driver you are using for a list of available options.
 
-## Source Code
+## GitHub Repository
 
 [GitHub](https://github.com/duckdblabs/duckdb_spatial)
diff --git a/docs/extensions/substrait.md b/docs/extensions/substrait.md
index 58c49d7da27..9796ce8b9e5 100644
--- a/docs/extensions/substrait.md
+++ b/docs/extensions/substrait.md
@@ -129,6 +129,6 @@ result <- duckdb::duckdb_prepare_substrait(con, proto_bytes)
 df <- dbFetch(result)
 ```
 
-## Source Code
+## GitHub Repository
 
 [GitHub](https://github.com/duckdblabs/substrait)

From 9fdc16fd1d89b293209043182472b60011927419 Mon Sep 17 00:00:00 2001
From: Gabor Szarnyas
Date: Thu, 21 Sep 2023 10:38:29 +0200
Subject: [PATCH 7/7] Spatial: Use consistent list bullets

---
 docs/extensions/spatial.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/extensions/spatial.md b/docs/extensions/spatial.md
index 157d36055c4..e35e4cfa8c5 100644
--- a/docs/extensions/spatial.md
+++ b/docs/extensions/spatial.md
@@ -118,13 +118,13 @@ CREATE TABLE AS SELECT * FROM ST_Read('some/file/path/filename.json');
 ```sql
 ST_Read(VARCHAR, sequential_layer_scan : BOOLEAN, spatial_filter : WKB_BLOB, open_options : VARCHAR[], layer : VARCHAR, allowed_drivers : VARCHAR[], sibling_files : VARCHAR[], spatial_filter_box : BOX_2D)
 ```
-- `sequential_layer_scan` (default: `false`): If set to `true`, the table function will scan through all layers sequentially and return the first layer that matches the given `layer` name. This is required for some drivers to work properly, e.g., the `OSM` driver.
-- `spatial_filter` (default: `NULL`): If set to a WKB blob, the table function will only return rows that intersect with the given WKB geometry. Some drivers may support efficient spatial filtering natively, in which case it will be pushed down. Otherwise the filtering is done by GDAL which may be much slower.
-- `open_options` (default: `[]`): A list of key-value pairs that are passed to the GDAL driver to control the opening of the file. E.g., the `GeoJSON` driver supports a `FLATTEN_NESTED_ATTRIBUTES=YES` option to flatten nested attributes.
-- `layer` (default: `NULL`): The name of the layer to read from the file. If `NULL`, the first layer is returned. Can also be a layer index (starting at 0).
-- `allowed_drivers` (default: `[]`): A list of GDAL driver names that are allowed to be used to open the file. If empty, all drivers are allowed.
-- `sibling_files` (default: `[]`): A list of sibling files that are required to open the file. E.g., the `ESRI Shapefile` driver requires a `.shx` file to be present. Although most of the time these can be discovered automatically.
-- `spatial_filter_box` (default: `NULL`): If set to a `BOX_2D`, the table function will only return rows that intersect with the given bounding box. Similar to `spatial_filter`.
+* `sequential_layer_scan` (default: `false`): If set to `true`, the table function will scan through all layers sequentially and return the first layer that matches the given `layer` name. This is required for some drivers to work properly, e.g., the `OSM` driver.
+* `spatial_filter` (default: `NULL`): If set to a WKB blob, the table function will only return rows that intersect with the given WKB geometry. Some drivers may support efficient spatial filtering natively, in which case it will be pushed down. Otherwise the filtering is done by GDAL which may be much slower.
+* `open_options` (default: `[]`): A list of key-value pairs that are passed to the GDAL driver to control the opening of the file. E.g., the `GeoJSON` driver supports a `FLATTEN_NESTED_ATTRIBUTES=YES` option to flatten nested attributes.
+* `layer` (default: `NULL`): The name of the layer to read from the file. If `NULL`, the first layer is returned. Can also be a layer index (starting at 0).
+* `allowed_drivers` (default: `[]`): A list of GDAL driver names that are allowed to be used to open the file. If empty, all drivers are allowed.
+* `sibling_files` (default: `[]`): A list of sibling files that are required to open the file. E.g., the `ESRI Shapefile` driver requires a `.shx` file to be present. Although most of the time these can be discovered automatically.
+* `spatial_filter_box` (default: `NULL`): If set to a `BOX_2D`, the table function will only return rows that intersect with the given bounding box. Similar to `spatial_filter`.
 
 Note that GDAL is single-threaded, so this table function will not be able to make full use of parallelism. We're planning to implement support for the most common vector formats natively in this extension with additional table functions in the future.
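
As an illustration of the `ST_Read` parameters listed in the last hunk, the following is a minimal sketch of a call that passes several of them as named arguments. It assumes the `spatial` extension can be installed for the DuckDB build in use; the layer name `my_layer` is a hypothetical placeholder, while the file path and the `FLATTEN_NESTED_ATTRIBUTES=YES` option are the ones quoted in the documentation text.

```sql
-- Sketch only: assumes the spatial extension is available for this DuckDB build.
INSTALL spatial;
LOAD spatial;

-- 'my_layer' is a hypothetical layer name; the path is the placeholder used in the docs.
SELECT *
FROM ST_Read(
    'some/file/path/filename.json',
    layer = 'my_layer',
    open_options = ['FLATTEN_NESTED_ATTRIBUTES=YES'],
    allowed_drivers = ['GeoJSON']
);
```

Restricting `allowed_drivers` to the expected format is optional; per the parameter list, an empty list allows every GDAL driver.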