Merge pull request #1200 from szarnyasg/extension-pages
Rework extension pages
szarnyasg committed Sep 21, 2023
2 parents bfc2653 + 9fdc16f commit e55399f
Showing 14 changed files with 195 additions and 175 deletions.
4 changes: 4 additions & 0 deletions _data/menu_docs_dev.json
@@ -939,6 +939,10 @@
"page": "Overview",
"url": "overview"
},
{
"page": "Official Extensions",
"url": "official_extensions"
},
{
"page": "Working with Extensions",
"url": "working_with_extensions"
52 changes: 26 additions & 26 deletions css/main.scss
@@ -1024,6 +1024,32 @@ body.documentation{
nav.mobile{
display: none;
}
span.github{
vertical-align: 1px;
display: inline-block;
background: #D9D9D9;
height: 17px;
line-height: 17px;
padding: 0 5px;
border-radius: 50px;
font-size: 10px;
color: black;
margin-left: 2px;
font-family: "SuisseIntl-Medium";
transition: background .2s;
&::after{
content: "";
background-image: url("data:image/svg+xml,%3Csvg width='10' height='10' viewBox='0 0 10 10' fill='none' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='m4.625 2.373.75-.751a2.123 2.123 0 1 1 3.003 3.003l-.75.75M5.374 7.627l-.75.751a2.123 2.123 0 0 1-3.003-3.003l.75-.75m.751 2.252 3.754-3.754' stroke='%23000' stroke-linecap='round' stroke-linejoin='round'/%3E%3C/svg%3E");
display: inline-block;
width: 10px;
height: 10px;
margin-left: 5px;
vertical-align: -1px;
}
&:hover{
background: #78A6FF;
}
}
main .wrap{
width: calc(100% - 252px);
margin-left: 252px;
@@ -1043,32 +1069,6 @@ body.documentation{
margin-bottom: -4px;
margin-top: 0px;
}}
table span.git{
vertical-align: 1px;
display: inline-block;
background: #D9D9D9;
height: 17px;
line-height: 17px;
padding: 0 5px;
border-radius: 50px;
font-size: 10px;
color: black;
margin-left: 2px;
font-family: "SuisseIntl-Medium";
transition: background .2s;
&::after{
content: "";
background-image: url("data:image/svg+xml,%3Csvg width='10' height='10' viewBox='0 0 10 10' fill='none' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='m4.625 2.373.75-.751a2.123 2.123 0 1 1 3.003 3.003l-.75.75M5.374 7.627l-.75.751a2.123 2.123 0 0 1-3.003-3.003l.75-.75m.751 2.252 3.754-3.754' stroke='%23000' stroke-linecap='round' stroke-linejoin='round'/%3E%3C/svg%3E");
display: inline-block;
width: 10px;
height: 10px;
margin-left: 5px;
vertical-align: -1px;
}
&:hover{
background: #78A6FF;
}
}
/*#all-available-extensions td > a:first-of-type{
margin-right: 8px;
}*/
68 changes: 35 additions & 33 deletions docs/api/wasm/extensions.md
@@ -3,74 +3,76 @@ layout: docu
title: Extensions
---

DuckDB-Wasm (dynamic) extension loading is modeled after regular DuckDB extension loading, with a few relevant differences due to the difference in platform.
DuckDB-Wasm's (dynamic) extension loading is modeled after regular DuckDB's extension loading, with a few relevant differences due to the platform.

### Format

Extensions in DuckDB are binaries to be dynamically loaded via dlopen. A cryptographical signature is appended to the binary.
Extensions in DuckDB-Wasm are a regular Wasm file to be dynamically loaded via Emscripten's dlopen. A cryptographical signature is appended to the Wasm file as a WebAssembly custom section called `duckdb_signature`.
This ensures the file remanins a valid WebAssembly file. Currently we require this custom section to be the last one, but this can be potentially relaxed in the future.
Extensions in DuckDB are binaries to be dynamically loaded via `dlopen`. A cryptographic signature is appended to the binary.
An extension in DuckDB-Wasm is a regular Wasm file to be dynamically loaded via Emscripten's `dlopen`. A cryptographic signature is appended to the Wasm file as a WebAssembly custom section called `duckdb_signature`.
This ensures the file remains a valid WebAssembly file.

> Currently we require this custom section to be the last one, but this can be potentially relaxed in the future.
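
As a rough illustration of the layout described above, the following sketch walks the sections of a Wasm binary and checks whether the last one is a custom section named `duckdb_signature`. It relies only on the standard WebAssembly binary layout (an 8-byte header followed by sections, each made of an id byte, a LEB128-encoded size and a payload). The helper names are hypothetical, and the sketch does not verify the signature itself, since the signature contents are not described here.

```python
from typing import List, Optional, Tuple


def read_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 integer at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(wasm: bytes) -> List[Tuple[int, Optional[str]]]:
    """Return (section_id, custom_section_name_or_None) in file order."""
    if wasm[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    sections = []
    while pos < len(wasm):
        section_id = wasm[pos]
        size, body_start = read_uleb128(wasm, pos + 1)
        name = None
        if section_id == 0:
            # A custom section payload starts with a length-prefixed UTF-8 name.
            name_len, name_start = read_uleb128(wasm, body_start)
            name = wasm[name_start:name_start + name_len].decode("utf-8")
        sections.append((section_id, name))
        pos = body_start + size
    return sections


def signature_is_last(wasm: bytes) -> bool:
    """True if the final section is a custom section named duckdb_signature."""
    sections = list_sections(wasm)
    return bool(sections) and sections[-1] == (0, "duckdb_signature")
```

Calling `signature_is_last(open(path, "rb").read())` on a downloaded (and decompressed) extension file would then indicate whether it follows the layout stated above.
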
### INSTALL and LOAD

INSTALL semantic in native embeddings of DuckDB is to fetch, decompress from gzip and store data in local disk.
LOAD semantic in native embeddings of DuckDB is to (optionally) perform signature checks AND dynamic load the binary with the main DuckDB binary.
In native embeddings of DuckDB, `INSTALL` fetches the extension, decompresses it from `gzip` and stores it on local disk.
In native embeddings of DuckDB, `LOAD` (optionally) performs signature checks *and* dynamically loads the binary into the main DuckDB binary.

In DuckDB-Wasm, INSTALL is a no-op given there is no durable cross-session storage. LOAD will fetch (and decompress on the fly), perform signature checks *and* dynamically load via the Emscripten implementation of dlopen.
In DuckDB-Wasm, `INSTALL` is a no-op given there is no durable cross-session storage. The `LOAD` operation will fetch the extension (decompressing it on the fly), perform signature checks *and* dynamically load it via the Emscripten implementation of `dlopen`.

### Autoloading

Autoloading, so the possibility for DuckDB to add extension functionality on-the-fly, is enabled by default in DuckDB-Wasm.
[Autoloading](../../extensions/overview), i.e., the possibility for DuckDB to add extension functionality on-the-fly, is enabled by default in DuckDB-Wasm.

### List of Officially Available Extensions

| Extension name | Description | Aliases |
|---|-----|--|
| autocomplete | Adds support for autocomplete in the shell | |
| [excel](../../extensions/excel) | Adds support for Excel-like format strings | |
| [fts](../../extensions/full_text_search) | Adds support for Full-Text Search Indexes | |
| icu | Adds support for time zones and collations using the ICU library | |
| inet | Adds support for IP-related data types and functions | |
| [json](../../extensions/json) | Adds support for JSON operations | |
| parquet | Adds support for reading and writing parquet files | |
| [sqlite_scanner](../../extensions/sqlite_scanner) [<span class="git">GitHub</span>](https://github.com/duckdblabs/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 |
| sqlsmit | | |
| [substrait](../../extensions/substrait) [<span class="git">GitHub</span>](https://github.com/duckdblabs/substrait) | Adds support for the Substrait integration | |
| tpcds | Adds TPC-DS data generation and query support | |
| tpch | Adds TPC-H data generation and query support | |

WebAssembly is basically an additional platform, and there might be platform specific limitations that make some extensions not able to match their native capabilities or to perform them in a different way. We will document here relevant differences for DuckDB-hosted extensions.
| autocomplete | Adds support for autocomplete in the shell | |
| [excel](../../extensions/excel) | Adds support for Excel-like format strings | |
| [fts](../../extensions/full_text_search) | Adds support for Full-Text Search Indexes | |
| icu | Adds support for time zones and collations using the ICU library | |
| inet | Adds support for IP-related data types and functions | |
| [json](../../extensions/json) | Adds support for JSON operations | |
| parquet | Adds support for reading and writing Parquet files | |
| [sqlite_scanner](../../extensions/sqlite_scanner) [<span class="github">GitHub</span>](https://github.com/duckdblabs/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 |
| sqlsmith | | |
| [substrait](../../extensions/substrait) [<span class="github">GitHub</span>](https://github.com/duckdblabs/substrait) | Adds support for the Substrait integration | |
| tpcds | Adds TPC-DS data generation and query support | |
| tpch | Adds TPC-H data generation and query support | |

WebAssembly is effectively an additional platform, and platform-specific limitations may prevent some extensions from matching their native capabilities, or require them to work differently. Relevant differences for DuckDB-hosted extensions are documented here.

#### HTTPFS

HTTPFS extension is, at the moment, not available in DuckDB-Wasm. Https protocol capabilities needs to go through an additional layer, the browser, that adds both differences and some restrictions to what's doable from native.
The HTTPFS extension is, at the moment, not available in DuckDB-Wasm. HTTPS protocol capabilities need to go through an additional layer, the browser, which adds both differences and some restrictions compared to what is possible natively.

### Extension signing
### Extension Signing

As with regular DuckDB extensions, DuckDB-Wasm extension are by default checked on LOAD to verify the signature confirm the extension has not been tampered with.
As with regular DuckDB extensions, DuckDB-Wasm extensions are by default checked on `LOAD` to verify the signature and confirm the extension has not been tampered with.
Extension signature verification can be disabled via a configuration option.
Signing is a property of the binary itself, so copying a DuckDB extension (say to serve it from a different location) will still keep a valid signature (for example for local development).
Signing is a property of the binary itself, so copying a DuckDB extension (say to serve it from a different location) will still keep a valid signature (e.g., for local development).

### Fetching DuckDB-Wasm extensions
### Fetching DuckDB-Wasm Extensions

DuckDB official extension are served at extensions.duckdb.org, and this is also the default value for the `default_extension_repository` option.
On installing extensions, a relevant URL will be built that will look like `extensions.duckdb.org/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.gz`.
Official DuckDB extensions are served at `extensions.duckdb.org`, and this is also the default value for the `default_extension_repository` option.
When installing extensions, a relevant URL will be built that will look like `extensions.duckdb.org/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.gz`.

DuckDB-Wasm extensions are fetched only on load, and the URL will look like: `extensions.duckdb.org/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm`.

Note that an additional `duckdb-wasm` is added to the folder structure, and the file is served as a `.wasm` file.
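
To make the difference between the two URL shapes concrete, here is a minimal sketch that assembles both from the same parts. The function names and the sample values passed to them are hypothetical placeholders; only the path layout is taken from the patterns quoted above.

```python
def native_extension_url(repository: str, version_hash: str, platform: str, name: str) -> str:
    """URL shape used when installing an extension in native DuckDB."""
    return f"{repository}/{version_hash}/{platform}/{name}.duckdb_extension.gz"


def wasm_extension_url(repository: str, version_hash: str, platform: str, name: str) -> str:
    """URL shape used when loading an extension in DuckDB-Wasm.

    Differs from the native shape by the extra `duckdb-wasm` path segment
    and the `.wasm` suffix in place of `.gz`.
    """
    return f"{repository}/duckdb-wasm/{version_hash}/{platform}/{name}.duckdb_extension.wasm"


# Placeholder values for illustration only, not real hashes or platform names.
print(native_extension_url("extensions.duckdb.org", "VERSION_HASH", "PLATFORM", "icu"))
print(wasm_extension_url("extensions.duckdb.org", "VERSION_HASH", "PLATFORM", "icu"))
```

Passing a different `repository` value gives the corresponding path on a third-party host, following the same layout.
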

DuckDB-Wasm extension are served pre-compressed using brotli compression. While fetched from a browser, extensions will be transparently uncompressed. If you want to fetch duckdb-wasm extension manually, you can use `curl --compress extensions.duckdb.org/......../icu.duckdb_extension.wasm`.
DuckDB-Wasm extensions are served pre-compressed using Brotli compression. When fetched from a browser, extensions will be transparently decompressed. If you want to fetch a DuckDB-Wasm extension manually, you can use `curl --compressed extensions.duckdb.org/<...>/icu.duckdb_extension.wasm`.

### Serving extension from a third party repository
### Serving Extensions from a Third-Party Repository

As with regular DuckDB, if you use `SET custom_extension_repository = some.url.com`, subsequent loads will be attempted at `some.url.com/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm`.

Note that GET requests on the extensions needs to be CORS enabled for a browser to allow the connection.
Note that GET requests on the extensions need to be [CORS enabled](https://www.w3.org/wiki/CORS_Enabled) for a browser to allow the connection.
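
For local development, one way to satisfy the CORS requirement is a small static file server that adds an `Access-Control-Allow-Origin` header to every response. The sketch below uses only the Python standard library. The class and function names are hypothetical, the wildcard origin is an assumption suitable for local testing only, and the served directory is assumed to already contain the `duckdb-wasm/...` folder layout described above.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that marks every response as readable cross-origin."""

    def end_headers(self) -> None:
        # Assumption: any origin may fetch; tighten this outside local testing.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_server(directory: str, port: int = 8000) -> ThreadingHTTPServer:
    """Create (but do not start) a server for files under `directory`."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("path/to/repo").serve_forever()` starts serving; the repository option would then point at `127.0.0.1:8000`. This server sends files as-is and does not apply Brotli compression.
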

### Tooling

Both extensions and the deployed DuckDB have been compiled using Emscripten 3.1.45.
Both DuckDB-Wasm and its extensions have been compiled using the latest packaged Emscripten toolchain.

<!-- markdownlint-disable-next-line -->
{% include iframe.html src="https://shell.duckdb.org" %}
6 changes: 2 additions & 4 deletions docs/extensions/autocomplete.md
@@ -2,14 +2,14 @@
layout: docu
title: AutoComplete
---

This extension adds support for autocomplete.

| Function | Description |
|:----------------------------------------|:-------------------------------------------------------|
| `sql_auto_complete(`*`query_string`*`)` | Attempts autocompletion on the given *`query_string`*. |


## Example:
## Example

```sql
SELECT * FROM sql_auto_complete('SEL');
@@ -39,5 +39,3 @@ Returns:
| DEALLOCATE | 0 |
| UPDATE | 0 |
| DROP | 0 |


2 changes: 1 addition & 1 deletion docs/extensions/full_text_search.md
@@ -2,6 +2,7 @@
layout: docu
title: Full Text Search
---

Full Text Search is an extension to DuckDB that allows for search through strings, similar to SQLite's FTS5 extension.

## API
@@ -70,7 +71,6 @@ Reduces words to their base. Used internally by the extension.
|input_string|`VARCHAR`|The column or constant to be stemmed|
|stemmer|`VARCHAR`|The type of stemmer to be used. One of `'arabic'`, `'basque'`, `'catalan'`, `'danish'`, `'dutch'`, `'english'`, `'finnish'`, `'french'`, `'german'`, `'greek'`, `'hindi'`, `'hungarian'`, `'indonesian'`, `'irish'`, `'italian'`, `'lithuanian'`, `'nepali'`, `'norwegian'`, `'porter'`, `'portuguese'`, `'romanian'`, `'russian'`, `'serbian'`, `'spanish'`, `'swedish'`, `'tamil'`, `'turkish'`, or `'none'` if no stemming is to be used.|


## Example Usage

```sql
8 changes: 4 additions & 4 deletions docs/extensions/httpfs.md
@@ -3,14 +3,14 @@ layout: docu
title: httpfs
---

The __httpfs__ extension is a loadable extension implementing a file system that allows reading remote/writing remote files. For plain HTTP(S), only file reading is supported. For object storage using the S3 API, the __httpfs__ extension supports reading/writing/globbing files.
The `httpfs` extension is a loadable extension implementing a file system that allows reading and writing remote files. For plain HTTP(S), only file reading is supported. For object storage using the S3 API, the `httpfs` extension supports reading/writing/globbing files.

Some clients come prebundled with this extension, in which case it's not necessary to first install or even load the extension.
Depending on the client you use, no action may be required, or you might have to `INSTALL httpfs` on first use and use `LOAD httpfs` at the start of every session.

## HTTP(S)

With the __httpfs__ extension, it is possible to directly query files over the HTTP(S) protocol. This currently works for CSV, JSON, and Parquet files.
With the `httpfs` extension, it is possible to directly query files over the HTTP(S) protocol. This currently works for CSV, JSON, and Parquet files.

```sql
SELECT * FROM 'https://domain.tld/file.extension';
@@ -40,11 +40,11 @@ SELECT * FROM parquet_scan(['https://domain.tld/file1.parquet', 'https://domain.

## S3

The __httpfs__ extension supports reading/writing/globbing files on object storage servers using the S3 API.
The `httpfs` extension supports reading/writing/globbing files on object storage servers using the S3 API.

### Requirements

The __httpfs__ filesystem is tested with [AWS S3](https://aws.amazon.com/s3/), [Minio](https://min.io/), [Google cloud](https://cloud.google.com/storage/docs/interoperability), and [lakeFS](https://docs.lakefs.io/integrations/duckdb.html). Other services that implement the S3 API should also work, but not all features may be supported. Below is a list of which parts of the S3 API are required for each __httpfs__ feature.
The `httpfs` filesystem is tested with [AWS S3](https://aws.amazon.com/s3/), [Minio](https://min.io/), [Google cloud](https://cloud.google.com/storage/docs/interoperability), and [lakeFS](https://docs.lakefs.io/integrations/duckdb.html). Other services that implement the S3 API should also work, but not all features may be supported. Below is a list of which parts of the S3 API are required for each `httpfs` feature.

| Feature | Required S3 API features |
|:---|:---|
9 changes: 7 additions & 2 deletions docs/extensions/iceberg.md
@@ -2,6 +2,11 @@
layout: docu
title: Iceberg
---
The [__iceberg__ extension](https://github.com/duckdblabs/duckdb_iceberg) is a loadable extension that implements support for the [Apache Iceberg format](https://iceberg.apache.org/).

> This extension currently only works on main branch of DuckDB.
The `iceberg` extension is a loadable extension that implements support for the [Apache Iceberg format](https://iceberg.apache.org/).

> This extension currently only works on the `main` branch of DuckDB (bleeding edge releases).
## GitHub Repository

[<span class="github">GitHub</span>](https://github.com/duckdblabs/duckdb_iceberg)
3 changes: 2 additions & 1 deletion docs/extensions/json.md
@@ -2,7 +2,8 @@
layout: docu
title: JSON
---
The __json__ extension is a loadable extension that implements SQL functions that are useful for reading values from existing JSON, and creating new JSON data.

The `json` extension is a loadable extension that implements SQL functions that are useful for reading values from existing JSON, and creating new JSON data.

## JSON Type
